You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@groovy.apache.org by pa...@apache.org on 2020/09/03 12:11:29 UTC

[groovy] 03/03: doco: add monoid functional pattern

This is an automated email from the ASF dual-hosted git repository.

paulk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/groovy.git

commit 84bfe55eeea48e59c09eb98a0761e029e00df4e6
Author: Paul King <pa...@asert.com.au>
AuthorDate: Thu Sep 3 22:11:07 2020 +1000

    doco: add monoid functional pattern
---
 src/spec/doc/fragment_design-pattern-monoid.adoc | 225 +++++++++++++++++++++++
 src/spec/test/DesignPatternsTest.groovy          |  89 +++++++++
 2 files changed, 314 insertions(+)

diff --git a/src/spec/doc/fragment_design-pattern-monoid.adoc b/src/spec/doc/fragment_design-pattern-monoid.adoc
new file mode 100644
index 0000000..0c72524
--- /dev/null
+++ b/src/spec/doc/fragment_design-pattern-monoid.adoc
@@ -0,0 +1,225 @@
+//////////////////////////////////////////
+
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+
+//////////////////////////////////////////
+
+= Using Monoids
+
+
+https://en.wikipedia.org/wiki/Monoid#Monoids_in_computer_science[Monoids] allow
+the mechanics of an aggregation algorithm to be separated from the algorithm-specific logic associated with that aggregation.
+It is often thought to be a functional design pattern.
+
+Perhaps, it is easiest seen with an example. Consider the code for integer sum, integer product and string concatenation.
+We might note various similarities:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_intro,indent=0]
+----
+<1> Initialize an aggregate counter
+<2> Loop throw elements with for/while/iteration adjusting counter
+
+We can remove the duplicate aggregation coding and the tease out the important differences for each algorithm.
+We might instead use Groovy's `inject` method. This is a _fold_ operation in functional programming jargon.
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_inject,indent=0]
+----
+
+Here the first parameter is the initial value, and the supplied closure contains the algorithm-specific logic.
+
+Similarly, we can use the JDK stream API and lambda syntax as follows:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_lambdas,indent=0]
+----
+
+== A touch of formalism
+
+Looking at these examples, we might think all aggregation can be supported this way.
+In fact, we look for certain characteristics to ensure that this aggregation pattern will apply:
+
+* Closure: performing the aggregation step should produce a result of the same type as the elements being aggregated.
+****
+Examples: `1L + 3L` produces a `Long`, and `'foo' + 'bar'` produces a `String`. +
+Non-monoid examples: `'foo'.size() + 'bar'.size()` (takes strings, returns an integer),
+the type _odd numbers_ with respect to addition, algorithms which don't handle null arguments if such arguments are possible.
+****
+[NOTE]
+====
+When using the term _Closure_ here, we simply mean closed under the operation, not the Groovy `Closure` class.
+====
+* Associativity: the order in which we apply the aggregation step should not matter.
+****
+Examples: `(1 + 3) + 5` is the same as `1 + (3 + 5)`, and `('a' + 'b') + 'c'` is the same as `'a' + ('b' + 'c')`. +
+Non-monoid example: `(10 - 5) - 3` is not equal to `10 - (5 - 3)` therefore integers are not a monoid with respect to subtraction.
+****
+* Identity element (sometimes also called a 'zero' element):
+there should be an element which aggregated with any element returns the original element.
+****
+Examples: `0 + 42 == 42`, `42 + 0 == 42`, `1 * 42 == 42`, and `'' + 'foo' == 'foo'`. +
+Non-monoid example: the type _non-empty strings_ is not a monoid with respect to concatenation.
+****
+
+If your algorithm doesn't satisfy all the monoid properties, that doesn't mean aggregation isn't possible.
+It just means that you won't get all the benefits from monoids, which we'll cover shortly, or you might have
+a little more work to do.
+Also, you might be able to convert your data structures slightly to turn your problem into one involving monoids.
+We'll cover that topic a little later in this section.
+
+== Benefits of monoids
+
+Consider adding the integers 10 through 16.
+Because the operation of addition for integers is a monoid, we already know that we can save writing code
+and instead use the approach we saw in the earlier `inject` examples.
+There are some other nice properties.
+
+Because of the _closure_ property,
+if we have a pairwise method like `sum(Integer a, Integer b)`, then for a monoid, we can always
+extend that method to work with a list, e.g. `sum(List<Integer> nums)` or `sum(Integer first, Integer... rest)`.
+
+Because of _associativity_,
+we can employ some interesting ways to solve the aggregation including:
+
+* Divide and conquer algorithms which break the problem into smaller pieces
+* Various incremental algorithms for example memoization would allow summing from 1..5
+to potentially start part way through be reusing a cached value of summing 1..4 if that had been calculated earlier
+* Inherent parallelization can make use of multiple cores
+
+Let's just look at the first of these in more detail. With a multi-core
+processor, one core could add `10` plus `11`, another core `12` plus `13`, and so on.
+We'd use the _identity_ element if needed (shown being added to `16` in our example).
+Then the intermediate results could also be added together concurrently and so on until the result was reached.
+
+[plantuml, MonoidAddition, png]
+....
+skinparam shadowing false
+skinparam ClassFontSize 18
+skinparam ClassBackgroundColor<<Identity>> Transparent
+skinparam ClassBorderColor<<Identity>> grey
+skinparam ClassStereotypeFontSize<<Identity>> 4
+skinparam ClassStereotypeFontColor<<Identity>> Transparent
+hide circle
+class " 0 " as zero << (I,lightgrey) Identity >>
+show zero circle
+class " 10 " as a1
+class " 11 " as a2
+class " 12 " as a3
+class " 13 " as a4
+class " 14 " as a5
+class " 15 " as a6
+class " 16 " as a7
+class " 21 " as b1
+class " 25 " as b2
+class " 29 " as b3
+class " 16 " as b4
+class " 46 " as c1
+class " 45 " as c2
+class " 91 " as d1
+a1 .r[hidden].> a2
+a2 .r[hidden].> a3
+a3 .r[hidden].> a4
+a4 .r[hidden].> a5
+a5 .r[hidden].> a6
+a6 .r[hidden].> a7
+a7 .r[hidden].> zero
+b1 <.d. a1
+b1 <.d. a2
+b2 <.d. a3
+b2 <.d. a4
+b3 <.d. a5
+b3 <.d. a6
+b4 <.d. a7
+b4 <.d. zero
+c1 <.d. b1
+c1 <.d. b2
+c2 <.d. b3
+c2 <.d. b4
+d1 <.d. c1
+d1 <.d. c2
+hide empty members
+....
+
+We have reduced the amount of code we need to write, and we also have potential performance gains.
+
+Here is how we might code the previous example using the http://gpars.org/[GPars]
+concurrency and parallelism framework (two alternatives shown):
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_gpars,indent=0]
+----
+
+== Working with Non-monoids
+
+Suppose we want to find the average of the numbers 1..10. Groovy has a built-in method for this:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_average_1to10,indent=0]
+----
+
+Now, suppose we want to build our own monoid solution instead of using the built-in version.
+It might seem difficult to find the _Identity_ element. After all:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_average_0to10,indent=0]
+----
+
+Similarly, if we are tempted to write the aggregation closure it might be something like:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_average_bad,indent=0]
+----
+
+What `b` can we use for the _Identity_ element here so that our equation returns the original?
+We need to use `a`, but that isn't a fixed value, so there is no _Identity_.
+
+Also, associativity doesn't hold for this initial attempt at defining `avg` as these examples show:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_average_assoc,indent=0]
+----
+
+Also, what about our _Closure_ property? Our original numbers were integers, but our average (`5.5`)
+is not. We can solve this by making our average work for any `Number` instances, but it might not always be this easy.
+
+It might appear that this problem is not amenable to a monoidal solution.
+However, there are numerous ways to bring monoids into the solution.
+We can split it into two parts:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_average_split,indent=0]
+----
+
+The calculation of `sum()` can follow monoid rules and then our last step can calculate the average.
+
+Alternatively, we can introduce a helper data structure that reworks the problem to be a monoid as follows:
+
+[source,groovy]
+----
+include::../test/DesignPatternsTest.groovy[tags=monoids_average_reworked,indent=0]
+----
diff --git a/src/spec/test/DesignPatternsTest.groovy b/src/spec/test/DesignPatternsTest.groovy
index 276a6c0..14a6965 100644
--- a/src/spec/test/DesignPatternsTest.groovy
+++ b/src/spec/test/DesignPatternsTest.groovy
@@ -1242,6 +1242,95 @@ class DesignPatternsTest extends CompilableTestSupport {
         '''
     }
 
+    void testMonoids() {
+        assertScript '''
+        // tag::monoids_intro[]
+        def nums = [1, 2, 3, 4]
+        
+        def sum = 0    // <1>
+        for (num in nums) { sum += num }    // <2>
+        assert sum == 10
+        
+        def product = 1    // <1>
+        for (num in nums) { product *= num }    // <2>
+        assert product == 24
+        
+        def letters = ['a', 'b', 'c']
+        
+        def concat = ''    // <1>
+        for (letter in letters) { concat += letter }    // <2>
+        assert concat == 'abc'
+        // end::monoids_intro[]
+        // tag::monoids_inject[]
+        assert nums.inject(0){ total, next -> total + next } == 10
+        assert nums.inject(1){ total, next -> total * next } == 24
+        assert letters.inject(''){ total, next -> total + next } == 'abc'
+        // end::monoids_inject[]
+        // tag::monoids_lambdas[]
+        assert nums.stream().reduce(0, (total, next) -> total + next) == 10
+        assert nums.stream().reduce(1, (total, next) -> total * next) == 24
+        assert letters.stream().reduce('', (total, next) -> total + next) == 'abc'
+        // end::monoids_lambdas[]
+        '''
+        assertScript '''
+        import groovyx.gpars.GParsPool
+
+        // tag::monoids_gpars[]
+        def nums = 10..16
+        GParsPool.withPool {
+            assert 91 == nums.injectParallel(0){ total, next -> total + next }
+            assert 91 == nums.parallel.reduce(0, (total, next) -> total + next)
+        }
+        // end::monoids_gpars[]
+        // tag::monoids_average_1to10[]
+        assert (1..10).average() == 5.5
+        // end::monoids_average_1to10[]
+        // tag::monoids_average_0to10[]
+        assert (0..10).average() == 5
+        // end::monoids_average_0to10[]
+        '''
+        assertScript '''
+        // tag::monoids_average_bad[]
+        def avg = { a, b -> (a + b) / 2 }
+        // end::monoids_average_bad[]
+        // tag::monoids_average_assoc[]
+        assert 6 == avg(avg(10, 2), 6)
+        assert 7 == avg(10, avg(2, 6))
+        // end::monoids_average_assoc[]
+        '''
+        assertScript '''
+        // tag::monoids_average_split[]
+        def nums = 1..10
+        def total = nums.sum()
+        def avg = total / nums.size()
+        assert avg == 5.5
+        // end::monoids_average_split[]
+        '''
+        assertScript '''
+        // tag::monoids_average_reworked[]
+        class AverageHolder {
+            int total
+            int count
+            AverageHolder plus(AverageHolder other) {
+                return new AverageHolder(total: total + other.total,
+                                         count: count + other.count)
+            }
+            static final AverageHolder ZERO =
+                new AverageHolder(total: 0, count: 0)
+        }
+        def nums = 1..10
+        def aggregate = nums.inject(AverageHolder.ZERO) { aggregate, next ->
+            if (next instanceof Integer) {
+                next = new AverageHolder(total: next, count : 1)
+            }
+            aggregate + next
+        }
+        def avg = aggregate.with{ total / count }
+        assert avg == 5.5
+        // end::monoids_average_reworked[]
+        '''
+    }
+
     void testNullObjectSimpleExample() {
         shouldCompile '''
             // tag::null_object_simple_example[]