Posted to commits@commons.apache.org by er...@apache.org on 2020/06/28 22:57:41 UTC

[commons-math] branch master updated (0d937ab -> 849d551)

This is an automated email from the ASF dual-hosted git repository.

erans pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git.


    from 0d937ab  Simplify "instanceof" usage.
     new 15dad0b  Use stream API in place of explicit loop.
     new 24e2c24  Formatting (unit test).
     new 960ba53  MATH-1547: Ranking of any number of the best matching units of a neural network.
     new 9cbf1d1  MATH-1547: Remove "findBest" and "findBestAndSecondBest" methods from "MapUtils".
     new 28e5b80  MATH-1548: Move standard quality measures of a SOM into class "NeuronSquareMesh2D".
     new ed4817c  MATH-1548: Remove methods redundant with functionality defined in "NeuronSquareMesh2D".
     new 824d92f  Condition does not apply.
     new 849d551  Track changes.

The 8 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/changes/changes.xml                            |   6 +
 .../commons/math4/fitting/AbstractCurveFitter.java |  17 +-
 ...entiatorVectorMultivariateJacobianFunction.java |   8 +-
 .../math4/ml/clustering/DBSCANClusterer.java       |  16 +-
 .../commons/math4/ml/neuralnet/MapRanking.java     | 156 +++++++++++++
 .../commons/math4/ml/neuralnet/MapUtils.java       | 242 +--------------------
 .../apache/commons/math4/ml/neuralnet/Network.java |  22 +-
 .../ml/neuralnet/sofm/KohonenUpdateAction.java     |   6 +-
 .../ml/neuralnet/twod/NeuronSquareMesh2D.java      | 240 ++++++++++++++++++++
 .../math4/ml/neuralnet/twod/util/HitHistogram.java |  84 -------
 .../ml/neuralnet/twod/util/QuantizationError.java  |  77 -------
 .../neuralnet/twod/util/SmoothedDataHistogram.java |  10 +-
 .../twod/util/TopographicErrorHistogram.java       |  92 --------
 .../neuralnet/twod/util/UnifiedDistanceMatrix.java |  81 ++-----
 .../org/apache/commons/math4/stat/StatUtils.java   |   7 +-
 .../{MapUtilsTest.java => MapRankingTest.java}     |  34 ++-
 .../ml/neuralnet/sofm/KohonenUpdateActionTest.java |   7 +-
 .../ml/neuralnet/twod/NeuronSquareMesh2DTest.java  |  58 ++++-
 18 files changed, 522 insertions(+), 641 deletions(-)
 create mode 100644 src/main/java/org/apache/commons/math4/ml/neuralnet/MapRanking.java
 delete mode 100644 src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java
 delete mode 100644 src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java
 delete mode 100644 src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java
 rename src/test/java/org/apache/commons/math4/ml/neuralnet/{MapUtilsTest.java => MapRankingTest.java} (74%)


[commons-math] 02/08: Formatting (unit test).

Posted by er...@apache.org.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit 24e2c246ce165d3bb058ad9ca4d81a2776c9c3cf
Author: Gilles Sadowski <gi...@gmail.com>
AuthorDate: Thu Jun 25 17:58:28 2020 +0200

    Formatting (unit test).
---
 .../ml/neuralnet/twod/NeuronSquareMesh2DTest.java      | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java b/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java
index 1c58b93..693f59b 100644
--- a/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java
+++ b/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java
@@ -220,9 +220,9 @@ public class NeuronSquareMesh2DTest {
     public void test3x2CylinderNetwork2() {
         final FeatureInitializer[] initArray = { init };
         final Network net = new NeuronSquareMesh2D(2, false,
-                                             3, true,
-                                             SquareNeighbourhood.MOORE,
-                                             initArray).getNetwork();
+                                                   3, true,
+                                                   SquareNeighbourhood.MOORE,
+                                                   initArray).getNetwork();
         Collection<Neuron> neighbours;
 
         // All neurons.
@@ -345,9 +345,9 @@ public class NeuronSquareMesh2DTest {
     public void test3x3TorusNetwork2() {
         final FeatureInitializer[] initArray = { init };
         final Network net = new NeuronSquareMesh2D(3, true,
-                                             3, true,
-                                             SquareNeighbourhood.MOORE,
-                                             initArray).getNetwork();
+                                                   3, true,
+                                                   SquareNeighbourhood.MOORE,
+                                                   initArray).getNetwork();
         Collection<Neuron> neighbours;
 
         // All neurons.
@@ -569,9 +569,9 @@ public class NeuronSquareMesh2DTest {
     public void testConcentricNeighbourhood() {
         final FeatureInitializer[] initArray = { init };
         final Network net = new NeuronSquareMesh2D(5, true,
-                                             5, true,
-                                             SquareNeighbourhood.VON_NEUMANN,
-                                             initArray).getNetwork();
+                                                   5, true,
+                                                   SquareNeighbourhood.VON_NEUMANN,
+                                                   initArray).getNetwork();
 
         Collection<Neuron> neighbours;
         Collection<Neuron> exclude = new HashSet<>();


[commons-math] 01/08: Use stream API in place of explicit loop.

Posted by er...@apache.org.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit 15dad0b04706eb4fd78b500f8c99d011235c6882
Author: PeterAlfredLee <pe...@gmail.com>
AuthorDate: Wed Jun 24 17:08:29 2020 +0800

    Use stream API in place of explicit loop.
    
    Closes #154.
---
 .../commons/math4/fitting/AbstractCurveFitter.java | 17 +++--------------
 ...entiatorVectorMultivariateJacobianFunction.java |  8 +++-----
 .../math4/ml/clustering/DBSCANClusterer.java       | 16 ++++------------
 .../apache/commons/math4/ml/neuralnet/Network.java | 22 +++++-----------------
 .../org/apache/commons/math4/stat/StatUtils.java   |  7 +------
 5 files changed, 16 insertions(+), 54 deletions(-)

diff --git a/src/main/java/org/apache/commons/math4/fitting/AbstractCurveFitter.java b/src/main/java/org/apache/commons/math4/fitting/AbstractCurveFitter.java
index 32aebc7..7290ca7 100644
--- a/src/main/java/org/apache/commons/math4/fitting/AbstractCurveFitter.java
+++ b/src/main/java/org/apache/commons/math4/fitting/AbstractCurveFitter.java
@@ -16,6 +16,7 @@
  */
 package org.apache.commons.math4.fitting;
 
+import java.util.Arrays;
 import java.util.Collection;
 
 import org.apache.commons.math4.analysis.MultivariateMatrixFunction;
@@ -101,13 +102,7 @@ public abstract class AbstractCurveFitter {
         public TheoreticalValuesFunction(final ParametricUnivariateFunction f,
                                          final Collection<WeightedObservedPoint> observations) {
             this.f = f;
-
-            final int len = observations.size();
-            this.points = new double[len];
-            int i = 0;
-            for (WeightedObservedPoint obs : observations) {
-                this.points[i++] = obs.getX();
-            }
+            this.points = observations.stream().mapToDouble(WeightedObservedPoint::getX).toArray();
         }
 
         /**
@@ -118,13 +113,7 @@ public abstract class AbstractCurveFitter {
                 /** {@inheritDoc} */
                 @Override
                 public double[] value(double[] p) {
-                    final int len = points.length;
-                    final double[] values = new double[len];
-                    for (int i = 0; i < len; i++) {
-                        values[i] = f.value(points[i], p);
-                    }
-
-                    return values;
+                    return Arrays.stream(points).map(point -> f.value(point, p)).toArray();
                 }
             };
         }
diff --git a/src/main/java/org/apache/commons/math4/fitting/leastsquares/DifferentiatorVectorMultivariateJacobianFunction.java b/src/main/java/org/apache/commons/math4/fitting/leastsquares/DifferentiatorVectorMultivariateJacobianFunction.java
index bc3207f..c912ee5 100644
--- a/src/main/java/org/apache/commons/math4/fitting/leastsquares/DifferentiatorVectorMultivariateJacobianFunction.java
+++ b/src/main/java/org/apache/commons/math4/fitting/leastsquares/DifferentiatorVectorMultivariateJacobianFunction.java
@@ -26,6 +26,8 @@ import org.apache.commons.math4.linear.RealMatrix;
 import org.apache.commons.math4.linear.RealVector;
 import org.apache.commons.math4.util.Pair;
 
+import java.util.Arrays;
+
 /**
  * A MultivariateJacobianFunction (a thing that requires a derivative)
  * combined with the thing that can find derivatives.
@@ -88,10 +90,6 @@ public class DifferentiatorVectorMultivariateJacobianFunction implements Multiva
         DerivativeStructure[] derivatives = differentiator
                 .differentiate(univariateVectorFunction)
                 .value(new DerivativeStructure(1, 1, 0, atParameterValue));
-        double[] derivativesOut = new double[derivatives.length];
-        for(int index=0;index<derivatives.length;index++) {
-            derivativesOut[index] = derivatives[index].getPartialDerivative(1);
-        }
-        return derivativesOut;
+        return Arrays.stream(derivatives).mapToDouble(derivative -> derivative.getPartialDerivative(1)).toArray();
     }
 }
diff --git a/src/main/java/org/apache/commons/math4/ml/clustering/DBSCANClusterer.java b/src/main/java/org/apache/commons/math4/ml/clustering/DBSCANClusterer.java
index e065eeb..5a92422 100644
--- a/src/main/java/org/apache/commons/math4/ml/clustering/DBSCANClusterer.java
+++ b/src/main/java/org/apache/commons/math4/ml/clustering/DBSCANClusterer.java
@@ -23,6 +23,7 @@ import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
+import java.util.stream.Collectors;
 
 import org.apache.commons.math4.exception.NotPositiveException;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
@@ -200,13 +201,8 @@ public class DBSCANClusterer<T extends Clusterable> extends Clusterer<T> {
      * @return the List of neighbors
      */
     private List<T> getNeighbors(final T point, final Collection<T> points) {
-        final List<T> neighbors = new ArrayList<>();
-        for (final T neighbor : points) {
-            if (point != neighbor && distance(neighbor, point) <= eps) {
-                neighbors.add(neighbor);
-            }
-        }
-        return neighbors;
+        return points.stream().filter(neighbor -> point != neighbor && distance(neighbor, point) <= eps)
+                              .collect(Collectors.toList());
     }
 
     /**
@@ -218,11 +214,7 @@ public class DBSCANClusterer<T extends Clusterable> extends Clusterer<T> {
      */
     private List<T> merge(final List<T> one, final List<T> two) {
         final Set<T> oneSet = new HashSet<>(one);
-        for (T item : two) {
-            if (!oneSet.contains(item)) {
-                one.add(item);
-            }
-        }
+        two.stream().filter(item -> !oneSet.contains(item)).forEach(one::add);
         return one;
     }
 }
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/Network.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/Network.java
index 6da635e..040963b 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/Network.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/Network.java
@@ -31,6 +31,7 @@ import java.util.Collections;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.atomic.AtomicLong;
+import java.util.stream.Collectors;
 
 import org.apache.commons.math4.exception.DimensionMismatchException;
 import org.apache.commons.math4.exception.MathIllegalStateException;
@@ -216,12 +217,8 @@ public class Network
      * this network.
      */
     public void deleteNeuron(Neuron neuron) {
-        final Collection<Neuron> neighbours = getNeighbours(neuron);
-
         // Delete links from neighbours.
-        for (Neuron n : neighbours) {
-            deleteLink(n, neuron);
-        }
+        getNeighbours(neuron).forEach(neighbour -> deleteLink(neighbour, neuron));
 
         // Remove neuron.
         neuronMap.remove(neuron.getIdentifier());
@@ -357,22 +354,13 @@ public class Network
     public Collection<Neuron> getNeighbours(Iterable<Neuron> neurons,
                                             Iterable<Neuron> exclude) {
         final Set<Long> idList = new HashSet<>();
+        neurons.forEach(n -> idList.addAll(linkMap.get(n.getIdentifier())));
 
-        for (Neuron n : neurons) {
-            idList.addAll(linkMap.get(n.getIdentifier()));
-        }
         if (exclude != null) {
-            for (Neuron n : exclude) {
-                idList.remove(n.getIdentifier());
-            }
+            exclude.forEach(n -> idList.remove(n.getIdentifier()));
         }
 
-        final List<Neuron> neuronList = new ArrayList<>();
-        for (Long id : idList) {
-            neuronList.add(getNeuron(id));
-        }
-
-        return neuronList;
+        return idList.stream().map(this::getNeuron).collect(Collectors.toList());
     }
 
     /**
diff --git a/src/main/java/org/apache/commons/math4/stat/StatUtils.java b/src/main/java/org/apache/commons/math4/stat/StatUtils.java
index 0ab56c7..e736741 100644
--- a/src/main/java/org/apache/commons/math4/stat/StatUtils.java
+++ b/src/main/java/org/apache/commons/math4/stat/StatUtils.java
@@ -850,12 +850,7 @@ public final class StatUtils {
         }
         List<Double> list = freq.getMode();
         // Convert the list to an array of primitive double
-        double[] modes = new double[list.size()];
-        int i = 0;
-        for(Double c : list) {
-            modes[i++] = c.doubleValue();
-        }
-        return modes;
+        return list.stream().mapToDouble(Double::doubleValue).toArray();
     }
 
 }
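
Note on the commit above: each hunk replaces an explicit loop with a sequential stream pipeline that should produce the same result. The following is a minimal, self-contained sketch of the two patterns involved (copying boxed values into a primitive array, and filtering into a list). The class and method names are hypothetical and are not part of Commons Math; plain integers stand in for the clusterable points.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only: all names here are hypothetical, not Commons Math API. */
final class StreamLoopSketch {
    /** Explicit-loop version: copy boxed values into a primitive array. */
    static double[] toArrayLoop(List<Double> values) {
        final double[] out = new double[values.size()];
        int i = 0;
        for (Double v : values) {
            out[i++] = v.doubleValue();
        }
        return out;
    }

    /** Stream version: same elements, same order, for a sequential stream of a list. */
    static double[] toArrayStream(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }

    /** Filter-and-collect: replaces an "if" inside a loop that fills a new list. */
    static List<Integer> withinRadius(List<Integer> points, int center, int eps) {
        return points.stream()
            .filter(p -> p != center && Math.abs(p - center) <= eps)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        final List<Double> values = Arrays.asList(1.5, 2.5, 4.0);
        // Both conversions are expected to agree element by element.
        System.out.println(Arrays.equals(toArrayLoop(values), toArrayStream(values)));
        System.out.println(withinRadius(Arrays.asList(1, 4, 5, 9), 4, 2));
    }
}
```

The sketch only covers sequential streams over ordered collections, which is the case in the hunks above; a parallel stream combined with a side-effecting "forEach" would need separate consideration.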


[commons-math] 03/08: MATH-1547: Ranking of any number of the best matching units of a neural network.

Posted by er...@apache.org.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit 960ba5322beb3b0c26b71fb06b0e3131122a8a0d
Author: Gilles Sadowski <gi...@gmail.com>
AuthorDate: Fri Jun 26 15:32:02 2020 +0200

    MATH-1547: Ranking of any number of the best matching units of a neural network.
---
 src/changes/changes.xml                            |   3 +
 .../commons/math4/ml/neuralnet/MapRanking.java     | 156 +++++++++++++++++++++
 .../commons/math4/ml/neuralnet/MapUtils.java       |  90 +-----------
 .../commons/math4/ml/neuralnet/MapRankingTest.java | 108 ++++++++++++++
 4 files changed, 271 insertions(+), 86 deletions(-)

diff --git a/src/changes/changes.xml b/src/changes/changes.xml
index 2d82bf0..46e5b45 100644
--- a/src/changes/changes.xml
+++ b/src/changes/changes.xml
@@ -54,6 +54,9 @@ If the output is not quite correct, check for invisible trailing spaces!
     </release>
 
     <release version="4.0" date="XXXX-XX-XX" description="">
+      <action dev="erans" type="update" issue="MATH-1547">
+        More flexible ranking of SOFM.
+      </action>
       <action dev="erans" type="fix" issue="MATH-1537" due-to="Jin Xu">
         Clean-up (typos and unused "import" statements).
       </action>
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/MapRanking.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapRanking.java
new file mode 100644
index 0000000..4994e8c
--- /dev/null
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapRanking.java
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.math4.ml.neuralnet;
+
+import java.util.List;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+
+import org.apache.commons.math4.exception.NotStrictlyPositiveException;
+import org.apache.commons.math4.ml.distance.DistanceMeasure;
+
+/**
+ * Utility for ranking the units (neurons) of a network.
+ *
+ * @since 4.0
+ */
+public class MapRanking {
+    /** List corresponding to the map passed to the constructor. */
+    private final List<Neuron> map = new ArrayList<>();
+    /** Distance function for sorting. */
+    private final DistanceMeasure distance;
+
+    /**
+     * @param neurons List to be ranked.
+     * No defensive copy is performed.
+     * The {@link #rank(double[],int) created list of units} will
+     * be sorted in increasing order of the {@code distance}.
+     * @param distance Distance function.
+     */
+    public MapRanking(Iterable<Neuron> neurons,
+                      DistanceMeasure distance) {
+        this.distance = distance;
+
+        for (Neuron n : neurons) {
+            map.add(n); // No defensive copy.
+        }
+    }
+
+    /**
+     * Creates a list of the neurons whose features best correspond to the
+     * given {@code features}.
+     *
+     * @param features Data.
+     * @return the list of neurons sorted in increasing order of distance to
+     * the given data.
+     * @throws org.apache.commons.math4.exception.DimensionMismatchException
+     * if the size of the input is not compatible with the neurons features
+     * size.
+     */
+    public List<Neuron> rank(double[] features) {
+        return rank(features, map.size());
+    }
+
+    /**
+     * Creates a list of the neurons whose features best correspond to the
+     * given {@code features}.
+     *
+     * @param features Data.
+     * @param max Maximum size of the returned list.
+     * @return the list of neurons sorted in increasing order of distance to
+     * the given data.
+     * @throws org.apache.commons.math4.exception.DimensionMismatchException
+     * if the size of the input is not compatible with the neurons features
+     * size.
+     * @throws NotStrictlyPositiveException if {@code max <= 0}.
+     */
+    public List<Neuron> rank(double[] features,
+                             int max) {
+        if (max <= 0) {
+            throw new NotStrictlyPositiveException(max);
+        }
+        final int m = max <= map.size() ?
+            max :
+            map.size();
+        final List<PairNeuronDouble> list = new ArrayList<>(m);
+
+        for (final Neuron n : map) {
+            final double d = distance.compute(n.getFeatures(), features);
+            final PairNeuronDouble p = new PairNeuronDouble(n, d);
+
+            if (list.size() < m) {
+                list.add(p);
+                if (list.size() > 1) {
+                    // Sort if there is more than 1 element.
+                    Collections.sort(list, PairNeuronDouble.COMPARATOR);
+                }
+            } else {
+                final int last = list.size() - 1;
+                if (PairNeuronDouble.COMPARATOR.compare(p, list.get(last)) < 0) {
+                    list.set(last, p); // Replace worst entry.
+                    if (last > 0) {
+                        // Sort if there is more than 1 element.
+                        Collections.sort(list, PairNeuronDouble.COMPARATOR);
+                    }
+                }
+            }
+        }
+
+        final List<Neuron> result = new ArrayList<>(m);
+        for (PairNeuronDouble p : list) {
+            result.add(p.getNeuron());
+        }
+
+        return result;
+    }
+
+    /**
+     * Helper data structure holding a (Neuron, double) pair.
+     */
+    private static class PairNeuronDouble {
+        /** Comparator. */
+        static final Comparator<PairNeuronDouble> COMPARATOR
+            = new Comparator<PairNeuronDouble>() {
+            /** {@inheritDoc} */
+            @Override
+            public int compare(PairNeuronDouble o1,
+                               PairNeuronDouble o2) {
+                return Double.compare(o1.value, o2.value);
+            }
+        };
+        /** Key. */
+        private final Neuron neuron;
+        /** Value. */
+        private final double value;
+
+        /**
+         * @param neuron Neuron.
+         * @param value Value.
+         */
+        PairNeuronDouble(Neuron neuron, double value) {
+            this.neuron = neuron;
+            this.value = value;
+        }
+
+        /** @return the neuron. */
+        public Neuron getNeuron() {
+            return neuron;
+        }
+    }
+}
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
index a793fa0..4b1eb20 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
@@ -17,12 +17,9 @@
 
 package org.apache.commons.math4.ml.neuralnet;
 
-import java.util.ArrayList;
 import java.util.Collection;
-import java.util.Collections;
 import java.util.HashMap;
 import java.util.List;
-import java.util.Comparator;
 
 import org.apache.commons.math4.exception.NoDataException;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
@@ -56,17 +53,7 @@ public class MapUtils {
     public static Neuron findBest(double[] features,
                                   Iterable<Neuron> neurons,
                                   DistanceMeasure distance) {
-        Neuron best = null;
-        double min = Double.POSITIVE_INFINITY;
-        for (final Neuron n : neurons) {
-            final double d = distance.compute(n.getFeatures(), features);
-            if (d < min) {
-                min = d;
-                best = n;
-            }
-        }
-
-        return best;
+        return new MapRanking(neurons, distance).rank(features, 1).get(0);
     }
 
     /**
@@ -85,27 +72,8 @@ public class MapUtils {
     public static Pair<Neuron, Neuron> findBestAndSecondBest(double[] features,
                                                              Iterable<Neuron> neurons,
                                                              DistanceMeasure distance) {
-        Neuron[] best = { null, null };
-        double[] min = { Double.POSITIVE_INFINITY,
-                         Double.POSITIVE_INFINITY };
-        for (final Neuron n : neurons) {
-            final double d = distance.compute(n.getFeatures(), features);
-            if (d < min[0]) {
-                // Replace second best with old best.
-                min[1] = min[0];
-                best[1] = best[0];
-
-                // Store current as new best.
-                min[0] = d;
-                best[0] = n;
-            } else if (d < min[1]) {
-                // Replace old second best with current.
-                min[1] = d;
-                best[1] = n;
-            }
-        }
-
-        return new Pair<>(best[0], best[1]);
+        final List<Neuron> list = new MapRanking(neurons, distance).rank(features, 2);
+        return new Pair<>(list.get(0), list.get(1));
     }
 
     /**
@@ -130,22 +98,7 @@ public class MapUtils {
     public static Neuron[] sort(double[] features,
                                 Iterable<Neuron> neurons,
                                 DistanceMeasure distance) {
-        final List<PairNeuronDouble> list = new ArrayList<>();
-
-        for (final Neuron n : neurons) {
-            final double d = distance.compute(n.getFeatures(), features);
-            list.add(new PairNeuronDouble(n, d));
-        }
-
-        Collections.sort(list, PairNeuronDouble.COMPARATOR);
-
-        final int len = list.size();
-        final Neuron[] sorted = new Neuron[len];
-
-        for (int i = 0; i < len; i++) {
-            sorted[i] = list.get(i).getNeuron();
-        }
-        return sorted;
+        return new MapRanking(neurons, distance).rank(features).toArray(new Neuron[0]);
     }
 
     /**
@@ -289,39 +242,4 @@ public class MapUtils {
 
         return ((double) notAdjacentCount) / count;
     }
-
-    /**
-     * Helper data structure holding a (Neuron, double) pair.
-     */
-    private static class PairNeuronDouble {
-        /** Comparator. */
-        static final Comparator<PairNeuronDouble> COMPARATOR
-            = new Comparator<PairNeuronDouble>() {
-            /** {@inheritDoc} */
-            @Override
-            public int compare(PairNeuronDouble o1,
-                               PairNeuronDouble o2) {
-                return Double.compare(o1.value, o2.value);
-            }
-        };
-        /** Key. */
-        private final Neuron neuron;
-        /** Value. */
-        private final double value;
-
-        /**
-         * @param neuron Neuron.
-         * @param value Value.
-         */
-        PairNeuronDouble(Neuron neuron, double value) {
-            this.neuron = neuron;
-            this.value = value;
-        }
-
-        /** @return the neuron. */
-        public Neuron getNeuron() {
-            return neuron;
-        }
-
-    }
 }
diff --git a/src/test/java/org/apache/commons/math4/ml/neuralnet/MapRankingTest.java b/src/test/java/org/apache/commons/math4/ml/neuralnet/MapRankingTest.java
new file mode 100644
index 0000000..d835088
--- /dev/null
+++ b/src/test/java/org/apache/commons/math4/ml/neuralnet/MapRankingTest.java
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.math4.ml.neuralnet;
+
+import java.util.Set;
+import java.util.HashSet;
+
+import org.apache.commons.math4.exception.NotStrictlyPositiveException;
+import org.apache.commons.math4.ml.distance.DistanceMeasure;
+import org.apache.commons.math4.ml.distance.EuclideanDistance;
+import org.apache.commons.math4.ml.neuralnet.FeatureInitializer;
+import org.apache.commons.math4.ml.neuralnet.FeatureInitializerFactory;
+import org.apache.commons.math4.ml.neuralnet.MapUtils;
+import org.apache.commons.math4.ml.neuralnet.Network;
+import org.apache.commons.math4.ml.neuralnet.Neuron;
+import org.apache.commons.math4.ml.neuralnet.oned.NeuronString;
+import org.junit.Test;
+import org.junit.Assert;
+
+/**
+ * Tests for {@link MapRanking} class.
+ */
+public class MapRankingTest {
+    /*
+     * Test assumes that the network is
+     *
+     *  0-----1-----2
+     */
+    @Test
+    public void testFindClosestNeuron() {
+        final FeatureInitializer init
+            = new OffsetFeatureInitializer(FeatureInitializerFactory.uniform(-0.1, 0.1));
+        final FeatureInitializer[] initArray = { init };
+
+        final MapRanking ranking = new MapRanking(new NeuronString(3, false, initArray).getNetwork(),
+                                                  new EuclideanDistance());
+
+        final Set<Neuron> allBest = new HashSet<>();
+        final Set<Neuron> best = new HashSet<>();
+        double[][] features;
+
+        // The following tests ensure that
+        // 1. the same neuron is always selected when the input feature is
+        //    in the range of the initializer,
+        // 2. different neurons of the network are selected by input features
+        //    that belong to different ranges.
+
+        best.clear();
+        features = new double[][] {
+            { -1 },
+            { 0.4 },
+        };
+        for (double[] f : features) {
+            best.addAll(ranking.rank(f, 1));
+        }
+        Assert.assertEquals(1, best.size());
+        allBest.addAll(best);
+
+        best.clear();
+        features = new double[][] {
+            { 0.6 },
+            { 1.4 },
+        };
+        for (double[] f : features) {
+            best.addAll(ranking.rank(f, 1));
+        }
+        Assert.assertEquals(1, best.size());
+        allBest.addAll(best);
+
+        best.clear();
+        features = new double[][] {
+            { 1.6 },
+            { 3 },
+        };
+        for (double[] f : features) {
+            best.addAll(ranking.rank(f, 1));
+        }
+        Assert.assertEquals(1, best.size());
+        allBest.addAll(best);
+
+        Assert.assertEquals(3, allBest.size());
+    }
+
+    @Test(expected=NotStrictlyPositiveException.class)
+    public void testRankPrecondition() {
+        final FeatureInitializer init
+            = new OffsetFeatureInitializer(FeatureInitializerFactory.uniform(-0.1, 0.1));
+        final FeatureInitializer[] initArray = { init };
+
+        new MapRanking(new NeuronString(3, false, initArray).getNetwork(),
+                       new EuclideanDistance()).rank(new double[] { -1 }, 0);
+    }
+}
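
Note on the commit above: "MapRanking.rank(features, max)" keeps a list of at most "max" candidates ordered by distance and replaces the worst entry whenever a closer unit is found. The following is a minimal, self-contained sketch of that bounded-ranking idea. It rests on several assumptions: plain "double[]" vectors stand in for "Neuron" objects, a hand-written Euclidean distance stands in for "DistanceMeasure", "IllegalArgumentException" stands in for "NotStrictlyPositiveException", and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: hypothetical names, plain arrays instead of Neuron objects. */
final class RankingSketch {
    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            final double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns the indices of the (at most) "max" units closest to "features",
     * closest first.
     */
    static List<Integer> rank(double[][] units, double[] features, int max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be strictly positive: " + max);
        }
        final int m = Math.min(max, units.length);
        // Each entry is { index, distance }; the list never grows beyond m.
        final List<double[]> best = new ArrayList<>(m);
        final Comparator<double[]> byDistance = Comparator.comparingDouble(e -> e[1]);

        for (int i = 0; i < units.length; i++) {
            final double[] entry = { i, distance(units[i], features) };
            if (best.size() < m) {
                best.add(entry);
            } else if (byDistance.compare(entry, best.get(m - 1)) < 0) {
                best.set(m - 1, entry); // Replace the current worst entry.
            } else {
                continue; // Not among the m best so far.
            }
            best.sort(byDistance); // Keep the candidates in increasing order of distance.
        }

        final List<Integer> result = new ArrayList<>(m);
        for (double[] e : best) {
            result.add((int) e[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        final double[][] units = { { 0 }, { 1 }, { 2 } };
        // Distances to 1.4 are 1.4, 0.4 and 0.6, so units 1 and 2 rank first.
        System.out.println(rank(units, new double[] { 1.4 }, 2));
    }
}
```

With "max" equal to 1 this reduces to a search for the single best matching unit, which is how "findBest" is expressed in terms of the ranking in the diff above.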


[commons-math] 06/08: MATH-1548: Remove methods redundant with functionality defined in "NeuronSquareMesh2D".

Posted by er...@apache.org.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit ed4817c7301a942ac17d9a015a90ad8e406dc0e5
Author: Gilles Sadowski <gi...@gmail.com>
AuthorDate: Fri Jun 26 18:29:37 2020 +0200

    MATH-1548: Remove methods redundant with functionality defined in "NeuronSquareMesh2D".
---
 .../commons/math4/ml/neuralnet/MapUtils.java       | 85 ----------------------
 1 file changed, 85 deletions(-)

diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
index 40500b6..b6065bd 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
@@ -17,13 +17,9 @@
 
 package org.apache.commons.math4.ml.neuralnet;
 
-import java.util.Collection;
-import java.util.HashMap;
 import java.util.List;
-
 import org.apache.commons.math4.exception.NoDataException;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
-import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
 
 /**
  * Utilities for network maps.
@@ -37,87 +33,6 @@ public class MapUtils {
     private MapUtils() {}
 
     /**
-     * Computes the <a href="http://en.wikipedia.org/wiki/U-Matrix">
-     *  U-matrix</a> of a two-dimensional map.
-     *
-     * @param map Network.
-     * @param distance Function to use for computing the average
-     * distance from a neuron to its neighbours.
-     * @return the matrix of average distances.
-     */
-    public static double[][] computeU(NeuronSquareMesh2D map,
-                                      DistanceMeasure distance) {
-        final int numRows = map.getNumberOfRows();
-        final int numCols = map.getNumberOfColumns();
-        final double[][] uMatrix = new double[numRows][numCols];
-
-        final Network net = map.getNetwork();
-
-        for (int i = 0; i < numRows; i++) {
-            for (int j = 0; j < numCols; j++) {
-                final Neuron neuron = map.getNeuron(i, j);
-                final Collection<Neuron> neighbours = net.getNeighbours(neuron);
-                final double[] features = neuron.getFeatures();
-
-                double d = 0;
-                int count = 0;
-                for (Neuron n : neighbours) {
-                    ++count;
-                    d += distance.compute(features, n.getFeatures());
-                }
-
-                uMatrix[i][j] = d / count;
-            }
-        }
-
-        return uMatrix;
-    }
-
-    /**
-     * Computes the "hit" histogram of a two-dimensional map.
-     *
-     * @param data Feature vectors.
-     * @param map Network.
-     * @param distance Function to use for determining the best matching unit.
-     * @return the number of hits for each neuron in the map.
-     */
-    public static int[][] computeHitHistogram(Iterable<double[]> data,
-                                              NeuronSquareMesh2D map,
-                                              DistanceMeasure distance) {
-        final HashMap<Neuron, Integer> hit = new HashMap<>();
-        final MapRanking rank = new MapRanking(map.getNetwork(), distance);
-
-        for (double[] f : data) {
-            final Neuron best = rank.rank(f, 1).get(0);
-            final Integer count = hit.get(best);
-            if (count == null) {
-                hit.put(best, 1);
-            } else {
-                hit.put(best, count + 1);
-            }
-        }
-
-        // Copy the histogram data into a 2D map.
-        final int numRows = map.getNumberOfRows();
-        final int numCols = map.getNumberOfColumns();
-        final int[][] histo = new int[numRows][numCols];
-
-        for (int i = 0; i < numRows; i++) {
-            for (int j = 0; j < numCols; j++) {
-                final Neuron neuron = map.getNeuron(i, j);
-                final Integer count = hit.get(neuron);
-                if (count == null) {
-                    histo[i][j] = 0;
-                } else {
-                    histo[i][j] = count;
-                }
-            }
-        }
-
-        return histo;
-    }
-
-    /**
      * Computes the quantization error.
      * The quantization error is the average distance between a feature vector
      * and its "best matching unit" (closest neuron).


[commons-math] 04/08: MATH-1547: Remove "findBest" and "findBestAndSecondBest" methods from "MapUtils".

Posted by er...@apache.org.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit 9cbf1d184442063ec5ab833e954009b7f18c2781
Author: Gilles Sadowski <gi...@gmail.com>
AuthorDate: Fri Jun 26 17:49:21 2020 +0200

    MATH-1547: Remove "findBest" and "findBestAndSecondBest" methods from "MapUtils".
    
    Use equivalent functionality from class "MapRanking".
---
 .../commons/math4/ml/neuralnet/MapUtils.java       |  79 ++------------
 .../ml/neuralnet/sofm/KohonenUpdateAction.java     |   6 +-
 .../math4/ml/neuralnet/twod/util/HitHistogram.java |   5 +-
 .../ml/neuralnet/twod/util/QuantizationError.java  |   5 +-
 .../neuralnet/twod/util/SmoothedDataHistogram.java |  10 +-
 .../twod/util/TopographicErrorHistogram.java       |  13 +--
 .../commons/math4/ml/neuralnet/MapRankingTest.java |  19 +++-
 .../commons/math4/ml/neuralnet/MapUtilsTest.java   | 115 ---------------------
 .../ml/neuralnet/sofm/KohonenUpdateActionTest.java |   7 +-
 9 files changed, 53 insertions(+), 206 deletions(-)
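
Note on the migration: in the diff below, "findBest(f, net, distance)" becomes "rank.rank(f, 1).get(0)", "findBestAndSecondBest(...)" becomes "rank.rank(f, 2)", and "sort(...)" becomes "rank.rank(features)", with the "MapRanking" instance created once outside the loop. The following self-contained sketch mimics that "rank" contract on plain arrays. It is only an assumption-laden illustration: "RankSketch" is hypothetical, it returns indices rather than "Neuron" instances, and "IllegalArgumentException" stands in for the library's "NotStrictlyPositiveException".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of Commons Math. */
final class RankSketch {
    private RankSketch() {}

    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            final double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /**
     * @param features Data.
     * @param units Feature vectors of the units to scan.
     * @param max Maximum number of units to return (must be strictly positive).
     * @return the indices of the closest units, in increasing order of distance.
     */
    static List<Integer> rank(double[] features, List<double[]> units, int max) {
        if (max <= 0) {
            throw new IllegalArgumentException("max must be strictly positive: " + max);
        }

        final List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < units.size(); i++) {
            indices.add(i);
        }
        // Sort all candidates by distance to the data, then keep the first "max".
        indices.sort(Comparator.comparingDouble(i -> distance(units.get(i), features)));

        return new ArrayList<>(indices.subList(0, Math.min(max, indices.size())));
    }
}
```

With units at -0.5, 0.5, 1.5 and 2.5 and the input 3.4 (the values used by "testSort"), a full ranking gives the indices 3, 2, 1, 0; asking for one element gives the best matching unit, and asking for two gives the best and second best.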

diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
index 4b1eb20..40500b6 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/MapUtils.java
@@ -24,7 +24,6 @@ import java.util.List;
 import org.apache.commons.math4.exception.NoDataException;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
 import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
-import org.apache.commons.math4.util.Pair;
 
 /**
  * Utilities for network maps.
@@ -38,70 +37,6 @@ public class MapUtils {
     private MapUtils() {}
 
     /**
-     * Finds the neuron that best matches the given features.
-     *
-     * @param features Data.
-     * @param neurons List of neurons to scan. If the list is empty
-     * {@code null} will be returned.
-     * @param distance Distance function. The neuron's features are
-     * passed as the first argument to {@link DistanceMeasure#compute(double[],double[])}.
-     * @return the neuron whose features are closest to the given data.
-     * @throws org.apache.commons.math4.exception.DimensionMismatchException
-     * if the size of the input is not compatible with the neurons features
-     * size.
-     */
-    public static Neuron findBest(double[] features,
-                                  Iterable<Neuron> neurons,
-                                  DistanceMeasure distance) {
-        return new MapRanking(neurons, distance).rank(features, 1).get(0);
-    }
-
-    /**
-     * Finds the two neurons that best match the given features.
-     *
-     * @param features Data.
-     * @param neurons List of neurons to scan. If the list is empty
-     * {@code null} will be returned.
-     * @param distance Distance function. The neuron's features are
-     * passed as the first argument to {@link DistanceMeasure#compute(double[],double[])}.
-     * @return the two neurons whose features are closest to the given data.
-     * @throws org.apache.commons.math4.exception.DimensionMismatchException
-     * if the size of the input is not compatible with the neurons features
-     * size.
-     */
-    public static Pair<Neuron, Neuron> findBestAndSecondBest(double[] features,
-                                                             Iterable<Neuron> neurons,
-                                                             DistanceMeasure distance) {
-        final List<Neuron> list = new MapRanking(neurons, distance).rank(features, 2);
-        return new Pair<>(list.get(0), list.get(1));
-    }
-
-    /**
-     * Creates a list of neurons sorted in increased order of the distance
-     * to the given {@code features}.
-     *
-     * @param features Data.
-     * @param neurons List of neurons to scan. If it is empty, an empty array
-     * will be returned.
-     * @param distance Distance function.
-     * @return the neurons, sorted in increasing order of distance in data
-     * space.
-     * @throws org.apache.commons.math4.exception.DimensionMismatchException
-     * if the size of the input is not compatible with the neurons features
-     * size.
-     *
-     * @see #findBest(double[],Iterable,DistanceMeasure)
-     * @see #findBestAndSecondBest(double[],Iterable,DistanceMeasure)
-     *
-     * @since 3.6
-     */
-    public static Neuron[] sort(double[] features,
-                                Iterable<Neuron> neurons,
-                                DistanceMeasure distance) {
-        return new MapRanking(neurons, distance).rank(features).toArray(new Neuron[0]);
-    }
-
-    /**
      * Computes the <a href="http://en.wikipedia.org/wiki/U-Matrix">
      *  U-matrix</a> of a two-dimensional map.
      *
@@ -150,10 +85,10 @@ public class MapUtils {
                                               NeuronSquareMesh2D map,
                                               DistanceMeasure distance) {
         final HashMap<Neuron, Integer> hit = new HashMap<>();
-        final Network net = map.getNetwork();
+        final MapRanking rank = new MapRanking(map.getNetwork(), distance);
 
         for (double[] f : data) {
-            final Neuron best = findBest(f, net, distance);
+            final Neuron best = rank.rank(f, 1).get(0);
             final Integer count = hit.get(best);
             if (count == null) {
                 hit.put(best, 1);
@@ -196,11 +131,13 @@ public class MapUtils {
     public static double computeQuantizationError(Iterable<double[]> data,
                                                   Iterable<Neuron> neurons,
                                                   DistanceMeasure distance) {
+        final MapRanking rank = new MapRanking(neurons, distance);
+
         double d = 0;
         int count = 0;
         for (double[] f : data) {
             ++count;
-            d += distance.compute(f, findBest(f, neurons, distance).getFeatures());
+            d += distance.compute(f, rank.rank(f, 1).get(0).getFeatures());
         }
 
         if (count == 0) {
@@ -224,12 +161,14 @@ public class MapUtils {
     public static double computeTopographicError(Iterable<double[]> data,
                                                  Network net,
                                                  DistanceMeasure distance) {
+        final MapRanking rank = new MapRanking(net, distance);
+
         int notAdjacentCount = 0;
         int count = 0;
         for (double[] f : data) {
             ++count;
-            final Pair<Neuron, Neuron> p = findBestAndSecondBest(f, net, distance);
-            if (!net.getNeighbours(p.getFirst()).contains(p.getSecond())) {
+            final List<Neuron> p = rank.rank(f, 2);
+            if (!net.getNeighbours(p.get(0)).contains(p.get(1))) {
                 // Increment count if first and second best matching units
                 // are not neighbours.
                 ++notAdjacentCount;
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateAction.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateAction.java
index 691fd90..162e69c 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateAction.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateAction.java
@@ -24,7 +24,7 @@ import java.util.concurrent.atomic.AtomicLong;
 import org.apache.commons.math4.analysis.function.Gaussian;
 import org.apache.commons.math4.linear.ArrayRealVector;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
-import org.apache.commons.math4.ml.neuralnet.MapUtils;
+import org.apache.commons.math4.ml.neuralnet.MapRanking;
 import org.apache.commons.math4.ml.neuralnet.Network;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
 import org.apache.commons.math4.ml.neuralnet.UpdateAction;
@@ -194,8 +194,10 @@ public class KohonenUpdateAction implements UpdateAction {
     private Neuron findAndUpdateBestNeuron(Network net,
                                            double[] features,
                                            double learningRate) {
+        final MapRanking rank = new MapRanking(net, distance);
+
         while (true) {
-            final Neuron best = MapUtils.findBest(features, net, distance);
+            final Neuron best = rank.rank(features, 1).get(0);
 
             if (attemptNeuronUpdate(best, features, learningRate)) {
                 return best;
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java
index 98d253f..a88fa1d 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java
@@ -17,7 +17,7 @@
 
 package org.apache.commons.math4.ml.neuralnet.twod.util;
 
-import org.apache.commons.math4.ml.neuralnet.MapUtils;
+import org.apache.commons.math4.ml.neuralnet.MapRanking;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
 import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
@@ -54,6 +54,7 @@ public class HitHistogram implements MapDataVisualization {
         final int nC = map.getNumberOfColumns();
 
         final LocationFinder finder = new LocationFinder(map);
+        final MapRanking rank = new MapRanking(map.getNetwork(), distance);
 
         // Totla number of samples.
         int numSamples = 0;
@@ -61,7 +62,7 @@ public class HitHistogram implements MapDataVisualization {
         final double[][] hit = new double[nR][nC];
 
         for (double[] sample : data) {
-            final Neuron best = MapUtils.findBest(sample, map, distance);
+            final Neuron best = rank.rank(sample, 1).get(0);
 
             final LocationFinder.Location loc = finder.getLocation(best);
             final int row = loc.getRow();
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java
index 37b1c3d..f2bc9de 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java
@@ -17,7 +17,7 @@
 
 package org.apache.commons.math4.ml.neuralnet.twod.util;
 
-import org.apache.commons.math4.ml.neuralnet.MapUtils;
+import org.apache.commons.math4.ml.neuralnet.MapRanking;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
 import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
@@ -47,6 +47,7 @@ public class QuantizationError implements MapDataVisualization {
         final int nC = map.getNumberOfColumns();
 
         final LocationFinder finder = new LocationFinder(map);
+        final MapRanking rank = new MapRanking(map.getNetwork(), distance);
 
         // Hit bins.
         final int[][] hit = new int[nR][nC];
@@ -54,7 +55,7 @@ public class QuantizationError implements MapDataVisualization {
         final double[][] error = new double[nR][nC];
 
         for (double[] sample : data) {
-            final Neuron best = MapUtils.findBest(sample, map, distance);
+            final Neuron best = rank.rank(sample, 1).get(0);
 
             final LocationFinder.Location loc = finder.getLocation(best);
             final int row = loc.getRow();
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/SmoothedDataHistogram.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/SmoothedDataHistogram.java
index 1327055..caca11f 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/SmoothedDataHistogram.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/SmoothedDataHistogram.java
@@ -17,7 +17,8 @@
 
 package org.apache.commons.math4.ml.neuralnet.twod.util;
 
-import org.apache.commons.math4.ml.neuralnet.MapUtils;
+import java.util.List;
+import org.apache.commons.math4.ml.neuralnet.MapRanking;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
 import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
@@ -77,16 +78,15 @@ public class SmoothedDataHistogram implements MapDataVisualization {
         }
 
         final LocationFinder finder = new LocationFinder(map);
+        final MapRanking rank = new MapRanking(map.getNetwork(), distance);
 
         // Histogram bins.
         final double[][] histo = new double[nR][nC];
 
         for (double[] sample : data) {
-            final Neuron[] sorted = MapUtils.sort(sample,
-                                                  map.getNetwork(),
-                                                  distance);
+            final List<Neuron> sorted = rank.rank(sample);
             for (int i = 0; i < smoothingBins; i++) {
-                final LocationFinder.Location loc = finder.getLocation(sorted[i]);
+                final LocationFinder.Location loc = finder.getLocation(sorted.get(i));
                 final int row = loc.getRow();
                 final int col = loc.getColumn();
                 histo[row][col] += (smoothingBins - i) * membershipNormalization;
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java
index e3683c1..758e672 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java
@@ -17,12 +17,12 @@
 
 package org.apache.commons.math4.ml.neuralnet.twod.util;
 
-import org.apache.commons.math4.ml.neuralnet.MapUtils;
+import java.util.List;
+import org.apache.commons.math4.ml.neuralnet.MapRanking;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
 import org.apache.commons.math4.ml.neuralnet.Network;
 import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
-import org.apache.commons.math4.util.Pair;
 
 /**
  * Computes the topographic error histogram.
@@ -55,8 +55,9 @@ public class TopographicErrorHistogram implements MapDataVisualization {
         final int nR = map.getNumberOfRows();
         final int nC = map.getNumberOfColumns();
 
-        final Network net = map.getNetwork();
         final LocationFinder finder = new LocationFinder(map);
+        final Network net = map.getNetwork();
+        final MapRanking rank = new MapRanking(net, distance);
 
         // Hit bins.
         final int[][] hit = new int[nR][nC];
@@ -64,15 +65,15 @@ public class TopographicErrorHistogram implements MapDataVisualization {
         final double[][] error = new double[nR][nC];
 
         for (double[] sample : data) {
-            final Pair<Neuron, Neuron> p = MapUtils.findBestAndSecondBest(sample, map, distance);
-            final Neuron best = p.getFirst();
+            final List<Neuron> p = rank.rank(sample, 2);
+            final Neuron best = p.get(0);
 
             final LocationFinder.Location loc = finder.getLocation(best);
             final int row = loc.getRow();
             final int col = loc.getColumn();
             hit[row][col] += 1;
 
-            if (!net.getNeighbours(best).contains(p.getSecond())) {
+            if (!net.getNeighbours(best).contains(p.get(1))) {
                 // Increment count if first and second best matching units
                 // are not neighbours.
                 error[row][col] += 1;
diff --git a/src/test/java/org/apache/commons/math4/ml/neuralnet/MapRankingTest.java b/src/test/java/org/apache/commons/math4/ml/neuralnet/MapRankingTest.java
index d835088..7beb2d5 100644
--- a/src/test/java/org/apache/commons/math4/ml/neuralnet/MapRankingTest.java
+++ b/src/test/java/org/apache/commons/math4/ml/neuralnet/MapRankingTest.java
@@ -19,7 +19,7 @@ package org.apache.commons.math4.ml.neuralnet;
 
 import java.util.Set;
 import java.util.HashSet;
-
+import java.util.List;
 import org.apache.commons.math4.exception.NotStrictlyPositiveException;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
 import org.apache.commons.math4.ml.distance.EuclideanDistance;
@@ -105,4 +105,21 @@ public class MapRankingTest {
         new MapRanking(new NeuronString(3, false, initArray).getNetwork(),
                        new EuclideanDistance()).rank(new double[] { -1 }, 0);
     }
+
+    @Test
+    public void testSort() {
+        final Set<Neuron> list = new HashSet<>();
+
+        for (int i = 0; i < 4; i++) {
+            list.add(new Neuron(i, new double[] { i - 0.5 }));
+        }
+
+        final MapRanking rank = new MapRanking(list, new EuclideanDistance());
+        final List<Neuron> sorted = rank.rank(new double[] { 3.4 });
+
+        final long[] expected = new long[] { 3, 2, 1, 0 };
+        for (int i = 0; i < list.size(); i++) {
+            Assert.assertEquals(expected[i], sorted.get(i).getIdentifier());
+        }
+    }
 }
diff --git a/src/test/java/org/apache/commons/math4/ml/neuralnet/MapUtilsTest.java b/src/test/java/org/apache/commons/math4/ml/neuralnet/MapUtilsTest.java
deleted file mode 100644
index 4404aac..0000000
--- a/src/test/java/org/apache/commons/math4/ml/neuralnet/MapUtilsTest.java
+++ /dev/null
@@ -1,115 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.commons.math4.ml.neuralnet;
-
-import java.util.Set;
-import java.util.HashSet;
-
-import org.apache.commons.math4.ml.distance.DistanceMeasure;
-import org.apache.commons.math4.ml.distance.EuclideanDistance;
-import org.apache.commons.math4.ml.neuralnet.FeatureInitializer;
-import org.apache.commons.math4.ml.neuralnet.FeatureInitializerFactory;
-import org.apache.commons.math4.ml.neuralnet.MapUtils;
-import org.apache.commons.math4.ml.neuralnet.Network;
-import org.apache.commons.math4.ml.neuralnet.Neuron;
-import org.apache.commons.math4.ml.neuralnet.oned.NeuronString;
-import org.junit.Test;
-import org.junit.Assert;
-
-/**
- * Tests for {@link MapUtils} class.
- */
-public class MapUtilsTest {
-    /*
-     * Test assumes that the network is
-     *
-     *  0-----1-----2
-     */
-    @Test
-    public void testFindClosestNeuron() {
-        final FeatureInitializer init
-            = new OffsetFeatureInitializer(FeatureInitializerFactory.uniform(-0.1, 0.1));
-        final FeatureInitializer[] initArray = { init };
-
-        final Network net = new NeuronString(3, false, initArray).getNetwork();
-        final DistanceMeasure dist = new EuclideanDistance();
-
-        final Set<Neuron> allBest = new HashSet<>();
-        final Set<Neuron> best = new HashSet<>();
-        double[][] features;
-
-        // The following tests ensures that
-        // 1. the same neuron is always selected when the input feature is
-        //    in the range of the initializer,
-        // 2. different network's neuron have been selected by inputs features
-        //    that belong to different ranges.
-
-        best.clear();
-        features = new double[][] {
-            { -1 },
-            { 0.4 },
-        };
-        for (double[] f : features) {
-            best.add(MapUtils.findBest(f, net, dist));
-        }
-        Assert.assertEquals(1, best.size());
-        allBest.addAll(best);
-
-        best.clear();
-        features = new double[][] {
-            { 0.6 },
-            { 1.4 },
-        };
-        for (double[] f : features) {
-            best.add(MapUtils.findBest(f, net, dist));
-        }
-        Assert.assertEquals(1, best.size());
-        allBest.addAll(best);
-
-        best.clear();
-        features = new double[][] {
-            { 1.6 },
-            { 3 },
-        };
-        for (double[] f : features) {
-            best.add(MapUtils.findBest(f, net, dist));
-        }
-        Assert.assertEquals(1, best.size());
-        allBest.addAll(best);
-
-        Assert.assertEquals(3, allBest.size());
-    }
-
-    @Test
-    public void testSort() {
-        final Set<Neuron> list = new HashSet<>();
-
-        for (int i = 0; i < 4; i++) {
-            list.add(new Neuron(i, new double[] { i - 0.5 }));
-        }
-
-        final Neuron[] sorted = MapUtils.sort(new double[] { 3.4 },
-                                              list,
-                                              new EuclideanDistance());
-
-        final long[] expected = new long[] { 3, 2, 1, 0 };
-        for (int i = 0; i < list.size(); i++) {
-            Assert.assertEquals(expected[i], sorted[i].getIdentifier());
-        }
-    }
-}
diff --git a/src/test/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateActionTest.java b/src/test/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateActionTest.java
index 7a064cf..fbd7fc1 100644
--- a/src/test/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateActionTest.java
+++ b/src/test/java/org/apache/commons/math4/ml/neuralnet/sofm/KohonenUpdateActionTest.java
@@ -21,7 +21,7 @@ import org.apache.commons.math4.ml.distance.DistanceMeasure;
 import org.apache.commons.math4.ml.distance.EuclideanDistance;
 import org.apache.commons.math4.ml.neuralnet.FeatureInitializer;
 import org.apache.commons.math4.ml.neuralnet.FeatureInitializerFactory;
-import org.apache.commons.math4.ml.neuralnet.MapUtils;
+import org.apache.commons.math4.ml.neuralnet.MapRanking;
 import org.apache.commons.math4.ml.neuralnet.Network;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
 import org.apache.commons.math4.ml.neuralnet.OffsetFeatureInitializer;
@@ -58,6 +58,7 @@ public class KohonenUpdateActionTest {
         final NeighbourhoodSizeFunction neighbourhood
             = NeighbourhoodSizeFunctionFactory.exponentialDecay(3, 1, 100);
         final UpdateAction update = new KohonenUpdateAction(dist, learning, neighbourhood);
+        final MapRanking rank = new MapRanking(net, dist);
 
         // The following test ensures that, after one "update",
         // 1. when the initial learning rate equal to 1, the best matching
@@ -71,7 +72,7 @@ public class KohonenUpdateActionTest {
         for (Neuron n : net) {
             distancesBefore[count++] = dist.compute(n.getFeatures(), features);
         }
-        final Neuron bestBefore = MapUtils.findBest(features, net, dist);
+        final Neuron bestBefore = rank.rank(features, 1).get(0);
 
         // Initial distance from the best match is larger than zero.
         Assert.assertTrue(dist.compute(bestBefore.getFeatures(), features) >= 0.2);
@@ -83,7 +84,7 @@ public class KohonenUpdateActionTest {
         for (Neuron n : net) {
             distancesAfter[count++] = dist.compute(n.getFeatures(), features);
         }
-        final Neuron bestAfter = MapUtils.findBest(features, net, dist);
+        final Neuron bestAfter = rank.rank(features, 1).get(0);
 
         Assert.assertEquals(bestBefore, bestAfter);
         // Distance is now zero.


[commons-math] 05/08: MATH-1548: Move standard quality measures of a SOM into class "NeuronSquareMesh2D".

Posted by er...@apache.org.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit 28e5b802fe0b406cfba5605d1537acb0e327d177
Author: Gilles Sadowski <gi...@gmail.com>
AuthorDate: Fri Jun 26 18:05:12 2020 +0200

    MATH-1548: Move standard quality measures of a SOM into class "NeuronSquareMesh2D".
    
    All these indicators are usually computed in order to evaluate the quality of a SOM:
    Computing them separately is inefficient when the number of samples becomes large.
---
 .../ml/neuralnet/twod/NeuronSquareMesh2D.java      | 239 +++++++++++++++++++++
 .../math4/ml/neuralnet/twod/util/HitHistogram.java |  85 --------
 .../ml/neuralnet/twod/util/QuantizationError.java  |  78 -------
 .../twod/util/TopographicErrorHistogram.java       |  93 --------
 .../neuralnet/twod/util/UnifiedDistanceMatrix.java |  81 ++-----
 .../ml/neuralnet/twod/NeuronSquareMesh2DTest.java  |  40 ++++
 6 files changed, 292 insertions(+), 324 deletions(-)
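
Note on the rationale: each indicator needs the best matching unit (and, for the topographic error, the second best) of every sample, so computing them together lets one ranking per sample feed all accumulators. The sketch below shows that single-pass idea on a one-dimensional chain of scalar units. It is a simplified illustration under assumptions, not the "DataVisualization" code in this commit: "QualitySketch", the scalar features and the chain adjacency (index difference of 1) are hypothetical, and the means here are plain averages over the samples.

```java
/** Hypothetical stand-in for illustration; not part of Commons Math. */
final class QualitySketch {
    /** Number of samples whose best matching unit is the unit at that index. */
    final int[] hits;
    /** Average distance between a sample and its best matching unit. */
    final double meanQuantizationError;
    /** Fraction of samples whose best and second best units are not adjacent. */
    final double meanTopographicError;

    private QualitySketch(int[] hits, double meanQe, double meanTe) {
        this.hits = hits;
        this.meanQuantizationError = meanQe;
        this.meanTopographicError = meanTe;
    }

    /**
     * @param units Scalar feature of each unit; units i and i + 1 are neighbours.
     * @param data Scalar samples.
     * @return all indicators, computed in a single pass over the data.
     */
    static QualitySketch from(double[] units, double[] data) {
        if (units.length < 2 || data.length == 0) {
            throw new IllegalArgumentException("need at least two units and one sample");
        }

        final int[] hits = new int[units.length];
        double sumDistance = 0;
        int notAdjacent = 0;

        for (double sample : data) {
            // One scan yields both the best and the second best unit.
            int best = -1;
            int second = -1;
            for (int i = 0; i < units.length; i++) {
                final double d = Math.abs(units[i] - sample);
                if (best < 0 || d < Math.abs(units[best] - sample)) {
                    second = best;
                    best = i;
                } else if (second < 0 || d < Math.abs(units[second] - sample)) {
                    second = i;
                }
            }

            // The same winners update every accumulator.
            hits[best]++;
            sumDistance += Math.abs(units[best] - sample);
            if (Math.abs(best - second) != 1) {
                notAdjacent++;
            }
        }

        return new QualitySketch(hits,
                                 sumDistance / data.length,
                                 (double) notAdjacent / data.length);
    }
}
```

Computing the three indicators with three separate loops would repeat the inner scan for every sample, which is the cost the commit message refers to when the number of samples becomes large.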

diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java
index 3a5a126..17055da 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java
@@ -20,6 +20,7 @@ package org.apache.commons.math4.ml.neuralnet.twod;
 import java.util.List;
 import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.Collection;
 import java.io.Serializable;
 import java.io.ObjectInputStream;
 
@@ -30,6 +31,10 @@ import org.apache.commons.math4.ml.neuralnet.FeatureInitializer;
 import org.apache.commons.math4.ml.neuralnet.Network;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
 import org.apache.commons.math4.ml.neuralnet.SquareNeighbourhood;
+import org.apache.commons.math4.ml.neuralnet.MapRanking;
+import org.apache.commons.math4.ml.neuralnet.twod.util.LocationFinder;
+import org.apache.commons.math4.ml.distance.DistanceMeasure;
+import org.apache.commons.math4.ml.distance.EuclideanDistance;
 
 /**
  * Neural network with the topology of a two-dimensional surface.
@@ -340,6 +345,17 @@ public class NeuronSquareMesh2D
     }
 
     /**
+     * Computes various {@link DataVisualization indicators} of the quality
+     * of the representation of the given {@code data} by this map.
+     *
+     * @param data Features.
+     * @return a new instance holding quality indicators.
+     */
+    public DataVisualization computeQualityIndicators(Iterable<double[]> data) {
+        return DataVisualization.from(copy(), data);
+    }
+
+    /**
      * Computes the location of a neighbouring neuron.
      * Returns {@code null} if the resulting location is not part
      * of the map.
@@ -625,4 +641,227 @@ public class NeuronSquareMesh2D
                                           featuresList);
         }
     }
+
+    /**
+     * Miscellaneous indicators of the map quality:
+     * <ul>
+     *  <li>Hit histogram</li>
+     *  <li>Quantization error</li>
+     *  <li>Topographic error</li>
+     *  <li>Unified distance matrix</li>
+     * </ul>
+     */
+    public static class DataVisualization {
+        /** Distance function. */
+        private static final DistanceMeasure DISTANCE = new EuclideanDistance();
+        /** Total number of samples. */
+        private final int numberOfSamples;
+        /** Hit histogram. */
+        private final double[][] hitHistogram;
+        /** Quantization error. */
+        private final double[][] quantizationError;
+        /** Mean quantization error. */
+        private final double meanQuantizationError;
+        /** Topographic error. */
+        private final double[][] topographicError;
+        /** Mean topographic error. */
+        private final double meanTopographicError;
+        /** U-matrix. */
+        private final double[][] uMatrix;
+
+        /**
+         * @param numberOfSamples Number of samples.
+         * @param hitHistogram Hit histogram.
+         * @param quantizationError Quantization error.
+         * @param topographicError Topographic error.
+         * @param uMatrix U-matrix.
+         */
+        private DataVisualization(int numberOfSamples,
+                                  double[][] hitHistogram,
+                                  double[][] quantizationError,
+                                  double[][] topographicError,
+                                  double[][] uMatrix) {
+            this.numberOfSamples = numberOfSamples;
+            this.hitHistogram = hitHistogram;
+            this.quantizationError = quantizationError;
+            meanQuantizationError = hitWeightedMean(quantizationError, hitHistogram);
+            this.topographicError = topographicError;
+            meanTopographicError = hitWeightedMean(topographicError, hitHistogram);
+            this.uMatrix = uMatrix;
+        }
+
+        /**
+         * @param map Map.
+         * @param data Data.
+         * @return the metrics.
+         */
+        static DataVisualization from(NeuronSquareMesh2D map,
+                                      Iterable<double[]> data) {
+            final LocationFinder finder = new LocationFinder(map);
+            final MapRanking rank = new MapRanking(map, DISTANCE);
+            final Network net = map.getNetwork();
+            final int nR = map.getNumberOfRows();
+            final int nC = map.getNumberOfColumns();
+
+            // Hit bins.
+            final int[][] hitCounter = new int[nR][nC];
+            // Normalized hit histogram.
+            final double[][] hitHistogram = new double[nR][nC];
+            // Quantization error bins.
+            final double[][] quantizationError = new double[nR][nC];
+            // Topographic error bins.
+            final double[][] topographicError = new double[nR][nC];
+            // U-matrix.
+            final double[][] uMatrix = new double[nR][nC];
+
+            int numSamples = 0;
+            for (double[] sample : data) {
+                ++numSamples;
+
+                final List<Neuron> winners = rank.rank(sample, 2);
+                final Neuron best = winners.get(0);
+                final Neuron secondBest = winners.get(1);
+
+                final LocationFinder.Location locBest = finder.getLocation(best);
+                final int rowBest = locBest.getRow();
+                final int colBest = locBest.getColumn();
+                // Increment hit counter.
+                hitCounter[rowBest][colBest] += 1;
+
+                // Aggregate quantization error.
+                quantizationError[rowBest][colBest] += DISTANCE.compute(sample, best.getFeatures());
+
+                // Aggregate topographic error.
+                if (!net.getNeighbours(best).contains(secondBest)) {
+                    // Increment count if first and second best matching units
+                    // are not neighbours.
+                    topographicError[rowBest][colBest] += 1;
+                }
+            }
+
+            for (int r = 0; r < nR; r++) {
+                for (int c = 0; c < nC; c++) {
+                    final Neuron neuron = map.getNeuron(r, c);
+                    final Collection<Neuron> neighbours = net.getNeighbours(neuron);
+                    final double[] features = neuron.getFeatures();
+                    double uDistance = 0;
+                    int neighbourCount = 0;
+                    for (Neuron n : neighbours) {
+                        ++neighbourCount;
+                        uDistance += DISTANCE.compute(features, n.getFeatures());
+                    }
+
+                    final int hitCount = hitCounter[r][c];
+                    if (hitCount != 0) {
+                        hitHistogram[r][c] = hitCount / (double) numSamples;
+                        quantizationError[r][c] /= hitCount;
+                        topographicError[r][c] /= hitCount;
+                        uMatrix[r][c] = uDistance / neighbourCount;
+                    }
+                }
+            }
+
+            return new DataVisualization(numSamples,
+                                         hitHistogram,
+                                         quantizationError,
+                                         topographicError,
+                                         uMatrix);
+        }
+
+        /**
+         * @return the total number of samples.
+         */
+        public final int getNumberOfSamples() {
+            return numberOfSamples;
+        }
+
+        /**
+         * @return the quantization error.
+         * Each bin will contain the average of the distances between samples
+         * mapped to the corresponding unit and the weight vector of that unit.
+         * @see #getMeanQuantizationError()
+         */
+        public double[][] getQuantizationError() {
+            return copy(quantizationError);
+        }
+
+        /**
+         * @return the topographic error.
+         * Each bin will contain the fraction of the data mapped to that unit for
+         * which the first and second best matching units are not adjacent in the map.
+         * @see #getMeanTopographicError()
+         */
+        public double[][] getTopographicError() {
+            return copy(topographicError);
+        }
+
+        /**
+         * @return the hits histogram (normalized).
+         * Each bin will contain the fraction of the data for which the
+         * corresponding neuron is the best matching unit.
+         */
+        public double[][] getNormalizedHits() {
+            return copy(hitHistogram);
+        }
+
+        /**
+         * @return the U-matrix.
+         * Each bin will contain the average distance between a unit and all its
+         * neighbours (stored in the pixel corresponding to that unit of the
+         * 2D-map).  The number of neighbours taken into account
+         * depends on the network {@link org.apache.commons.math4.ml.neuralnet.SquareNeighbourhood
+         * neighbourhood type}.
+         */
+        public double[][] getUMatrix() {
+            return copy(uMatrix);
+        }
+
+        /**
+         * @return the mean (hit-weighted) quantization error.
+         * @see #getQuantizationError()
+         */
+        public double getMeanQuantizationError() {
+            return meanQuantizationError;
+        }
+
+        /**
+         * @return the mean (hit-weighted) topographic error.
+         * @see #getTopographicError()
+         */
+        public double getMeanTopographicError() {
+            return meanTopographicError;
+        }
+
+        /**
+         * @param orig Source.
+         * @return a deep copy of the original array.
+         */
+        private static double[][] copy(double[][] orig) {
+            final double[][] copy = new double[orig.length][];
+            for (int i = 0; i < orig.length; i++) {
+                copy[i] = orig[i].clone();
+            }
+
+            return copy;
+        }
+
+        /**
+         * @param metrics Metrics.
+         * @param normalizedHits Hits histogram (normalized).
+         * @return the hit-weighted mean of the given {@code metrics}.
+         */
+        private double hitWeightedMean(double[][] metrics,
+                                       double[][] normalizedHits) {
+            double mean = 0;
+            final int rows = metrics.length;
+            final int cols = metrics[0].length;
+            for (int i = 0; i < rows; i++) {
+                for (int j = 0; j < cols; j++) {
+                    mean += normalizedHits[i][j] * metrics[i][j];
+                }
+            }
+
+            return mean;
+        }
+    }
 }
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java
deleted file mode 100644
index a88fa1d..0000000
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/HitHistogram.java
+++ /dev/null
@@ -1,85 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *      http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.commons.math4.ml.neuralnet.twod.util;
-
-import org.apache.commons.math4.ml.neuralnet.MapRanking;
-import org.apache.commons.math4.ml.neuralnet.Neuron;
-import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
-import org.apache.commons.math4.ml.distance.DistanceMeasure;
-
-/**
- * Computes the hit histogram.
- * Each bin will contain the number of data for which the corresponding
- * neuron is the best matching unit.
- * @since 3.6
- */
-public class HitHistogram implements MapDataVisualization {
-    /** Distance. */
-    private final DistanceMeasure distance;
-    /** Whether to compute relative bin counts. */
-    private final boolean normalizeCount;
-
-    /**
-     * @param normalizeCount Whether to compute relative bin counts.
-     * If {@code true}, the data count in each bin will be divided by the total
-     * number of samples.
-     * @param distance Distance.
-     */
-    public HitHistogram(boolean normalizeCount,
-                        DistanceMeasure distance) {
-        this.normalizeCount = normalizeCount;
-        this.distance = distance;
-    }
-
-    /** {@inheritDoc} */
-    @Override
-    public double[][] computeImage(NeuronSquareMesh2D map,
-                                   Iterable<double[]> data) {
-        final int nR = map.getNumberOfRows();
-        final int nC = map.getNumberOfColumns();
-
-        final LocationFinder finder = new LocationFinder(map);
-        final MapRanking rank = new MapRanking(map.getNetwork(), distance);
-
-        // Total number of samples.
-        int numSamples = 0;
-        // Hit bins.
-        final double[][] hit = new double[nR][nC];
-
-        for (double[] sample : data) {
-            final Neuron best = rank.rank(sample, 1).get(0);
-
-            final LocationFinder.Location loc = finder.getLocation(best);
-            final int row = loc.getRow();
-            final int col = loc.getColumn();
-            hit[row][col] += 1;
-
-            ++numSamples;
-        }
-
-        if (normalizeCount) {
-            for (int r = 0; r < nR; r++) {
-                for (int c = 0; c < nC; c++) {
-                    hit[r][c] /= numSamples;
-                }
-            }
-        }
-
-        return hit;
-    }
-}
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java
deleted file mode 100644
index f2bc9de..0000000
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/QuantizationError.java
+++ /dev/null
@@ -1,78 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *      http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.commons.math4.ml.neuralnet.twod.util;
-
-import org.apache.commons.math4.ml.neuralnet.MapRanking;
-import org.apache.commons.math4.ml.neuralnet.Neuron;
-import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
-import org.apache.commons.math4.ml.distance.DistanceMeasure;
-
-/**
- * Computes the quantization error histogram.
- * Each bin will contain the average of the distances between samples
- * mapped to the corresponding unit and the weight vector of that unit.
- * @since 3.6
- */
-public class QuantizationError implements MapDataVisualization {
-    /** Distance. */
-    private final DistanceMeasure distance;
-
-    /**
-     * @param distance Distance.
-     */
-    public QuantizationError(DistanceMeasure distance) {
-        this.distance = distance;
-    }
-
-    /** {@inheritDoc} */
-    @Override
-    public double[][] computeImage(NeuronSquareMesh2D map,
-                                   Iterable<double[]> data) {
-        final int nR = map.getNumberOfRows();
-        final int nC = map.getNumberOfColumns();
-
-        final LocationFinder finder = new LocationFinder(map);
-        final MapRanking rank = new MapRanking(map.getNetwork(), distance);
-
-        // Hit bins.
-        final int[][] hit = new int[nR][nC];
-        // Error bins.
-        final double[][] error = new double[nR][nC];
-
-        for (double[] sample : data) {
-            final Neuron best = rank.rank(sample, 1).get(0);
-
-            final LocationFinder.Location loc = finder.getLocation(best);
-            final int row = loc.getRow();
-            final int col = loc.getColumn();
-            hit[row][col] += 1;
-            error[row][col] += distance.compute(sample, best.getFeatures());
-        }
-
-        for (int r = 0; r < nR; r++) {
-            for (int c = 0; c < nC; c++) {
-                final int count = hit[r][c];
-                if (count != 0) {
-                    error[r][c] /= count;
-                }
-            }
-        }
-
-        return error;
-    }
-}
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java
deleted file mode 100644
index 758e672..0000000
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/TopographicErrorHistogram.java
+++ /dev/null
@@ -1,93 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *      http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.commons.math4.ml.neuralnet.twod.util;
-
-import java.util.List;
-import org.apache.commons.math4.ml.neuralnet.MapRanking;
-import org.apache.commons.math4.ml.neuralnet.Neuron;
-import org.apache.commons.math4.ml.neuralnet.Network;
-import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
-import org.apache.commons.math4.ml.distance.DistanceMeasure;
-
-/**
- * Computes the topographic error histogram.
- * Each bin will contain the number of data for which the first and
- * second best matching units are not adjacent in the map.
- * @since 3.6
- */
-public class TopographicErrorHistogram implements MapDataVisualization {
-    /** Distance. */
-    private final DistanceMeasure distance;
-    /** Whether to compute relative bin counts. */
-    private final boolean relativeCount;
-
-    /**
-     * @param relativeCount Whether to compute relative bin counts.
-     * If {@code true}, the data count in each bin will be divided by the total
-     * number of samples mapped to the neuron represented by that bin.
-     * @param distance Distance.
-     */
-    public TopographicErrorHistogram(boolean relativeCount,
-                                     DistanceMeasure distance) {
-        this.relativeCount = relativeCount;
-        this.distance = distance;
-    }
-
-    /** {@inheritDoc} */
-    @Override
-    public double[][] computeImage(NeuronSquareMesh2D map,
-                                   Iterable<double[]> data) {
-        final int nR = map.getNumberOfRows();
-        final int nC = map.getNumberOfColumns();
-
-        final LocationFinder finder = new LocationFinder(map);
-        final Network net = map.getNetwork();
-        final MapRanking rank = new MapRanking(net, distance);
-
-        // Hit bins.
-        final int[][] hit = new int[nR][nC];
-        // Error bins.
-        final double[][] error = new double[nR][nC];
-
-        for (double[] sample : data) {
-            final List<Neuron> p = rank.rank(sample, 2);
-            final Neuron best = p.get(0);
-
-            final LocationFinder.Location loc = finder.getLocation(best);
-            final int row = loc.getRow();
-            final int col = loc.getColumn();
-            hit[row][col] += 1;
-
-            if (!net.getNeighbours(best).contains(p.get(1))) {
-                // Increment count if first and second best matching units
-                // are not neighbours.
-                error[row][col] += 1;
-            }
-        }
-
-        if (relativeCount) {
-            for (int r = 0; r < nR; r++) {
-                for (int c = 0; c < nC; c++) {
-                    error[r][c] /= hit[r][c];
-                }
-            }
-        }
-
-        return error;
-    }
-}
diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/UnifiedDistanceMatrix.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/UnifiedDistanceMatrix.java
index 0fa2002..9fa9318 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/UnifiedDistanceMatrix.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/util/UnifiedDistanceMatrix.java
@@ -17,59 +17,36 @@
 
 package org.apache.commons.math4.ml.neuralnet.twod.util;
 
-import java.util.Collection;
 import org.apache.commons.math4.ml.neuralnet.Neuron;
-import org.apache.commons.math4.ml.neuralnet.Network;
 import org.apache.commons.math4.ml.neuralnet.twod.NeuronSquareMesh2D;
 import org.apache.commons.math4.ml.distance.DistanceMeasure;
 
 /**
  * <a href="http://en.wikipedia.org/wiki/U-Matrix">U-Matrix</a>
  * visualization of high-dimensional data projection.
+ * The 8 individual inter-units distances will be
+ * {@link #computeImage(NeuronSquareMesh2D) computed}.  They will be
+ * stored in additional pixels around each of the original units of the
+ * 2D-map.  The additional pixels that lie along a "diagonal" are shared
+ * by <em>two</em> pairs of units: their value will be set to the average
+ * distance between the units belonging to each of the pairs.  The value
+ * zero will be stored in the pixel corresponding to the location of a
+ * unit of the 2D-map.
+ *
  * @since 3.6
+ * @see NeuronSquareMesh2D.DataVisualization#getUMatrix()
  */
 public class UnifiedDistanceMatrix implements MapVisualization {
-    /** Whether to show distance between each pair of neighbouring units. */
-    private final boolean individualDistances;
     /** Distance. */
     private final DistanceMeasure distance;
 
     /**
-     * Simple constructor.
-     *
-     * @param individualDistances If {@code true}, the 8 individual
-     * inter-units distances will be {@link #computeImage(NeuronSquareMesh2D)
-     * computed}.  They will be stored in additional pixels around each of
-     * the original units of the 2D-map.  The additional pixels that lie
-     * along a "diagonal" are shared by <em>two</em> pairs of units: their
-     * value will be set to the average distance between the units belonging
-     * to each of the pairs.  The value zero will be stored in the pixel
-     * corresponding to the location of a unit of the 2D-map.
-     * <br>
-     * If {@code false}, only the average distance between a unit and all its
-     * neighbours will be computed (and stored in the pixel corresponding to
-     * that unit of the 2D-map).  In that case, the number of neighbours taken
-     * into account depends on the network's
-     * {@link org.apache.commons.math4.ml.neuralnet.SquareNeighbourhood
-     * neighbourhood type}.
      * @param distance Distance.
      */
-    public UnifiedDistanceMatrix(boolean individualDistances,
-                                 DistanceMeasure distance) {
-        this.individualDistances = individualDistances;
+    public UnifiedDistanceMatrix(DistanceMeasure distance) {
         this.distance = distance;
     }
 
-    /** {@inheritDoc} */
-    @Override
-    public double[][] computeImage(NeuronSquareMesh2D map) {
-        if (individualDistances) {
-            return individualDistances(map);
-        } else {
-            return averageDistances(map);
-        }
-    }
-
     /**
      * Computes the distances between a unit of the map and its
      * neighbours.
@@ -81,7 +58,8 @@ public class UnifiedDistanceMatrix implements MapVisualization {
      * @param map Map.
      * @return an image representing the individual distances.
      */
-    private double[][] individualDistances(NeuronSquareMesh2D map) {
+    @Override
+    public double[][] computeImage(NeuronSquareMesh2D map) {
         final int numRows = map.getNumberOfRows();
         final int numCols = map.getNumberOfColumns();
 
@@ -174,37 +152,4 @@ public class UnifiedDistanceMatrix implements MapVisualization {
 
         return uMatrix;
     }
-
-    /**
-     * Computes the distances between a unit of the map and its neighbours.
-     *
-     * @param map Map.
-     * @return an image representing the average distances.
-     */
-    private double[][] averageDistances(NeuronSquareMesh2D map) {
-        final int numRows = map.getNumberOfRows();
-        final int numCols = map.getNumberOfColumns();
-        final double[][] uMatrix = new double[numRows][numCols];
-
-        final Network net = map.getNetwork();
-
-        for (int i = 0; i < numRows; i++) {
-            for (int j = 0; j < numCols; j++) {
-                final Neuron neuron = map.getNeuron(i, j);
-                final Collection<Neuron> neighbours = net.getNeighbours(neuron);
-                final double[] features = neuron.getFeatures();
-
-                double d = 0;
-                int count = 0;
-                for (Neuron n : neighbours) {
-                    ++count;
-                    d += distance.compute(features, n.getFeatures());
-                }
-
-                uMatrix[i][j] = d / count;
-            }
-        }
-
-        return uMatrix;
-    }
 }
diff --git a/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java b/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java
index 693f59b..55732a7 100644
--- a/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java
+++ b/src/test/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2DTest.java
@@ -25,6 +25,9 @@ import java.io.ObjectOutputStream;
 import java.util.Collection;
 import java.util.Set;
 import java.util.HashSet;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import java.util.stream.Collectors;
 
 import org.apache.commons.math4.exception.NumberIsTooSmallException;
 import org.apache.commons.math4.exception.OutOfRangeException;
@@ -872,4 +875,41 @@ public class NeuronSquareMesh2DTest {
             Assert.assertTrue(fromMap.contains(n));
         }
     }
+
+    @Test
+    public void testDataVisualization() {
+        final FeatureInitializer[] initArray = { init };
+        final NeuronSquareMesh2D map = new NeuronSquareMesh2D(3, true,
+                                                              3, true,
+                                                              SquareNeighbourhood.VON_NEUMANN,
+                                                              initArray);
+
+        // Trivial test: Use neurons' features as data.
+
+        final List<double[]> data = StreamSupport.stream(map.spliterator(), false)
+            .map(n -> n.getFeatures())
+            .collect(Collectors.toList());
+        final NeuronSquareMesh2D.DataVisualization v = map.computeQualityIndicators(data);
+
+        final int numRows = map.getNumberOfRows();
+        final int numCols = map.getNumberOfColumns();
+
+        // Test hits.
+        final double[][] hits = v.getNormalizedHits();
+        final double expectedHits = 1d / (numRows * numCols);
+        for (int i = 0; i < numRows; i++) {
+            for (int j = 0; j < numCols; j++) {
+                Assert.assertEquals(expectedHits, hits[i][j], 0d);
+            }
+        }
+
+        // Test quantization error.
+        final double[][] qe = v.getQuantizationError();
+        final double expectedQE = 0;
+        for (int i = 0; i < numRows; i++) {
+            for (int j = 0; j < numCols; j++) {
+                Assert.assertEquals(expectedQE, qe[i][j], 0d);
+            }
+        }
+    }
 }
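
Note on the hit-weighted means computed in "DataVisualization": the
constructor combines each per-unit metric with the normalized hit histogram.
The sketch below is a stand-alone illustration of that weighting only; the
class name "HitWeightedMeanSketch" and the sample values are hypothetical
and not part of Commons Math.

```java
/** Hypothetical stand-alone sketch; not part of Commons Math. */
final class HitWeightedMeanSketch {
    private HitWeightedMeanSketch() {}

    /**
     * Sums the per-unit metric weighted by the fraction of samples
     * mapped to each unit (same formula as in the diff above).
     */
    static double hitWeightedMean(double[][] metrics,
                                  double[][] normalizedHits) {
        double mean = 0;
        for (int i = 0; i < metrics.length; i++) {
            for (int j = 0; j < metrics[i].length; j++) {
                mean += normalizedHits[i][j] * metrics[i][j];
            }
        }
        return mean;
    }

    public static void main(String[] args) {
        // Assumed 2x2 map and 4 samples: 3 mapped to unit (0, 0),
        // 1 mapped to unit (1, 1).
        final double[][] hits = { { 0.75, 0 }, { 0, 0.25 } };
        // Per-unit average quantization error (zero where no sample was mapped).
        final double[][] qe = { { 2, 0 }, { 0, 4 } };
        // 0.75 * 2 + 0.25 * 4
        System.out.println(hitWeightedMean(qe, hits)); // prints 2.5
    }
}
```

Because each bin holds (sum of errors) / hitCount and each weight is
hitCount / numSamples, the weighted sum reduces to the plain average of the
per-sample errors over all samples; units with no hits contribute zero.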


[commons-math] 07/08: Condition does not apply.

Posted by er...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit 824d92fac676a5bbde947a437f1698960edf3bc6
Author: Gilles Sadowski <gi...@gmail.com>
AuthorDate: Sun Jun 28 11:13:24 2020 +0200

    Condition does not apply.
---
 .../org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java
index 17055da..b30e06a 100644
--- a/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java
+++ b/src/main/java/org/apache/commons/math4/ml/neuralnet/twod/NeuronSquareMesh2D.java
@@ -756,8 +756,9 @@ public class NeuronSquareMesh2D
                         hitHistogram[r][c] = hitCount / (double) numSamples;
                         quantizationError[r][c] /= hitCount;
                         topographicError[r][c] /= hitCount;
-                        uMatrix[r][c] = uDistance / neighbourCount;
                     }
+
+                    uMatrix[r][c] = uDistance / neighbourCount;
                 }
             }
 

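Note on the change above: the average-distance U-matrix depends only on the
units' features and their neighbours, not on how many samples hit a unit,
which is presumably why the assignment was moved out of the
"hitCount != 0" block.  The following is a minimal stand-alone sketch of
that computation, assuming a non-wrapped grid with 4-connected (Von Neumann
style) neighbours and a Euclidean distance; the class and method names are
hypothetical and do not use the Commons Math API.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not part of Commons Math. */
final class AverageUMatrixSketch {
    private AverageUMatrixSketch() {}

    /** Euclidean distance between two feature vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            final double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Average distance between each unit and its 4-connected neighbours.
     * Computed for every unit, whether or not any sample was mapped to it.
     * Assumes that each unit has at least one neighbour.
     */
    static double[][] averageDistances(double[][][] features) {
        final int nR = features.length;
        final int nC = features[0].length;
        final int[][] offsets = { { -1, 0 }, { 1, 0 }, { 0, -1 }, { 0, 1 } };
        final double[][] u = new double[nR][nC];
        for (int r = 0; r < nR; r++) {
            for (int c = 0; c < nC; c++) {
                double sum = 0;
                int count = 0;
                for (int[] o : offsets) {
                    final int rr = r + o[0];
                    final int cc = c + o[1];
                    if (rr >= 0 && rr < nR && cc >= 0 && cc < nC) {
                        sum += euclidean(features[r][c], features[rr][cc]);
                        ++count;
                    }
                }
                u[r][c] = sum / count;
            }
        }
        return u;
    }

    public static void main(String[] args) {
        // Assumed 1x3 map with one-dimensional features 0, 1 and 3.
        final double[][][] features = { { { 0 }, { 1 }, { 3 } } };
        System.out.println(Arrays.toString(averageDistances(features)[0]));
        // prints [1.0, 1.5, 2.0]
    }
}
```

With the assignment inside the "hitCount != 0" block, a unit that received
no sample would have kept a U-matrix value of zero even though its distance
to its neighbours is well defined.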

[commons-math] 08/08: Track changes.

Posted by er...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

erans pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-math.git

commit 849d551b8a9507a2ae8dfb81db36aa544a81ee47
Author: Gilles Sadowski <gi...@gmail.com>
AuthorDate: Mon Jun 29 00:51:55 2020 +0200

    Track changes.
---
 src/changes/changes.xml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/changes/changes.xml b/src/changes/changes.xml
index 46e5b45..0318eaa 100644
--- a/src/changes/changes.xml
+++ b/src/changes/changes.xml
@@ -54,6 +54,9 @@ If the output is not quite correct, check for invisible trailing spaces!
     </release>
 
     <release version="4.0" date="XXXX-XX-XX" description="">
+      <action dev="erans" type="fix" issue="MATH-1548">
+        Avoid inefficiencies in computing the standard quality measures of a SOFM.
+      </action>
       <action dev="erans" type="update" issue="MATH-1547">
         More flexible ranking of SOFM.
       </action>