You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@bahir.apache.org by lr...@apache.org on 2016/06/10 15:23:50 UTC

[11/50] [abbrv] bahir git commit: [SPARK-11812][PYSPARK] invFunc=None works properly with python's reduceByKeyAndWindow

[SPARK-11812][PYSPARK] invFunc=None works properly with python's reduceByKeyAndWindow

invFunc is optional and can be None. Instead of invFunc (the parameter) invReduceFunc (a local function) was checked for trueness (that is, not None, in this context). A local function is never None,
thus the case of invFunc=None (a common one when inverse reduction is not defined) was treated incorrectly, resulting in loss of data.

In addition, the docstring used wrong parameter names, also fixed.

Author: David Tolpin <da...@gmail.com>

Closes #9775 from dtolpin/master.


Project: http://git-wip-us.apache.org/repos/asf/bahir/repo
Commit: http://git-wip-us.apache.org/repos/asf/bahir/commit/0845b564
Tree: http://git-wip-us.apache.org/repos/asf/bahir/tree/0845b564
Diff: http://git-wip-us.apache.org/repos/asf/bahir/diff/0845b564

Branch: refs/heads/master
Commit: 0845b56452984ac72e0d144f66ab1bfcbd4323b9
Parents: 502fd82
Author: David Tolpin <da...@gmail.com>
Authored: Thu Nov 19 13:57:23 2015 -0800
Committer: Tathagata Das <ta...@gmail.com>
Committed: Thu Nov 19 13:57:23 2015 -0800

----------------------------------------------------------------------
 streaming-mqtt/python/dstream.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/bahir/blob/0845b564/streaming-mqtt/python/dstream.py
----------------------------------------------------------------------
diff --git a/streaming-mqtt/python/dstream.py b/streaming-mqtt/python/dstream.py
index 698336c..acec850 100644
--- a/streaming-mqtt/python/dstream.py
+++ b/streaming-mqtt/python/dstream.py
@@ -524,8 +524,8 @@ class DStream(object):
         `invFunc` can be None, then it will reduce all the RDDs in window, could be slower
         than having `invFunc`.
 
-        @param reduceFunc:     associative reduce function
-        @param invReduceFunc:  inverse function of `reduceFunc`
+        @param func:           associative reduce function
+        @param invFunc:        inverse function of `reduceFunc`
         @param windowDuration: width of the window; must be a multiple of this DStream's
                               batching interval
         @param slideDuration:  sliding interval of the window (i.e., the interval after which
@@ -556,7 +556,7 @@ class DStream(object):
                                     if kv[1] is not None else kv[0])
 
         jreduceFunc = TransformFunction(self._sc, reduceFunc, reduced._jrdd_deserializer)
-        if invReduceFunc:
+        if invFunc:
             jinvReduceFunc = TransformFunction(self._sc, invReduceFunc, reduced._jrdd_deserializer)
         else:
             jinvReduceFunc = None