
Posted to reviews@spark.apache.org by mengxr <gi...@git.apache.org> on 2015/04/15 22:16:20 UTC

[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

GitHub user mengxr opened a pull request:

    https://github.com/apache/spark/pull/5534

    [SPARK-6893][ML] default pipeline parameter handling in python

    Same as #5431 but for Python. @jkbradley

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mengxr/spark SPARK-6893

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5534.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5534
    
----
commit 5294500aafa2da6e6b7630b15de49ab0a264d93a
Author: Xiangrui Meng <me...@databricks.com>
Date:   2015-04-15T19:08:00Z

    update default param handling in python

commit 4d6b07a0388128424c3d1db738dd10c753c4783c
Author: Xiangrui Meng <me...@databricks.com>
Date:   2015-04-15T19:31:07Z

    add tests

commit fce244ef2ec99e2c450f26669d190ead5b927af2
Author: Xiangrui Meng <me...@databricks.com>
Date:   2015-04-15T20:11:50Z

    update explainParams with test

commit ebaccc6398c81b13af775769205c0fd19701138d
Author: Xiangrui Meng <me...@databricks.com>
Date:   2015-04-15T20:14:41Z

    style update

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by jkbradley <gi...@git.apache.org>.

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28470153
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -25,23 +25,21 @@
     
     class Param(object):
         """
    -    A param with self-contained documentation and optionally default value.
    +    A param with self-contained documentation.
         """
     
    -    def __init__(self, parent, name, doc, defaultValue=None):
    -        if not isinstance(parent, Identifiable):
    -            raise ValueError("Parent must be identifiable but got type %s." % type(parent).__name__)
    +    def __init__(self, parent, name, doc):
    +        if not isinstance(parent, Params):
    +            raise ValueError("Parent must be a Params but got type %s." % type(parent).__name__)
             self.parent = parent
             self.name = str(name)
             self.doc = str(doc)
    -        self.defaultValue = defaultValue
     
         def __str__(self):
    -        return str(self.parent) + "-" + self.name
    +        return str(self.parent) + "__" + self.name
    --- End diff --
    
    Is this important?  Should it be tested somewhere?



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by jkbradley <gi...@git.apache.org>.

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28470161
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -67,11 +66,112 @@ def params(self):
             return filter(lambda attr: isinstance(attr, Param),
                           [getattr(self, x) for x in dir(self) if x != "params"])
     
    -    def _merge_params(self, params):
    -        paramMap = self.paramMap.copy()
    -        paramMap.update(params)
    +    def _explain(self, param):
    +        """
    +        Explains a single param and returns its name, doc, and optional
    +        default value and user-supplied value in a string.
    +        """
    +        param = self._resolveParam(param)
    +        values = []
    +        if self.isDefined(param):
    +            if param in self.defaultParamMap:
    +                values.append("default: %s" % self.defaultParamMap[param])
    +            if param in self.paramMap:
    +                values.append("current: %s" % self.paramMap[param])
    +        else:
    +            values.append("undefined")
    +        valueStr = "(" + ", ".join(values) + ")"
    +        return "%s: %s %s" % (param.name, param.doc, valueStr)
    +
    +    def explainParams(self):
    +        """
    +        Returns the documentation of all params with their optionally
    +        default values and user-supplied values.
    +        """
    +        return "\n".join([self._explain(param) for param in self.params])
    +
    +    def getParam(self, paramName):
    +        """
    +        Gets a param by its name.
    +        """
    +        param = getattr(self, paramName)
    +        if isinstance(param, Param):
    +            return param
    +        else:
    +            raise ValueError("Cannot find param with name %s." % paramName)
    +
    +    def isSet(self, param):
    +        """
    +        Checks whether a param is explicitly set by user.
    +        """
    +        param = self._resolveParam(param)
    +        return param in self.paramMap
    +
    +    def hasDefault(self, param):
    +        """
    +        Checks whether a param has a default value.
    +        """
    +        param = self._resolveParam(param)
    +        return param in self.defaultParamMap
    +
    +    def isDefined(self, param):
    +        """
    +        Checks whether a param is explicitly set by user or has a default value.
    +        """
    +        return self.isSet(param) or self.hasDefault(param)
    +
    +    def getOrDefault(self, param):
    +        """
    +        Gets the value of a param in the user-supplied param map or its
    +        default value. Raises an error if either is set.
    +        """
    +        if isinstance(param, Param):
    +            if param in self.paramMap:
    +                return self.paramMap[param]
    +            else:
    +                return self.defaultParamMap[param]
    +        elif isinstance(param, str):
    +            return self.getOrDefault(self.getParam(param))
    +        else:
    +            raise KeyError("Cannot recognize %r as a param." % param)
    +
    +    def extractParamMap(self, extraParamMap={}):
    +        """
    +        Extracts the embedded default param values and user-supplied
    +        values, and then merges them with extra values from input into
    +        a flat param map, where the latter values is used if there
    --- End diff --
    
    "is" --> "are"
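The lookup rules in the quoted diff (`isSet`, `hasDefault`, `isDefined`, `getOrDefault`) can be sketched in isolation. This is a minimal illustration, not the PR's code: `ParamStore` is a hypothetical class, and it keys the maps by plain strings rather than `Param` instances. As in the quoted `getOrDefault`, a param that is neither set nor defaulted ends in a `KeyError` from the dictionary lookup.

```python
class ParamStore(object):
    """Hypothetical stand-in for Params, keyed by param name (an assumption)."""

    def __init__(self):
        self.paramMap = {}         # user-supplied values
        self.defaultParamMap = {}  # default values

    def isSet(self, name):
        return name in self.paramMap

    def hasDefault(self, name):
        return name in self.defaultParamMap

    def isDefined(self, name):
        return self.isSet(name) or self.hasDefault(name)

    def getOrDefault(self, name):
        # A user-supplied value wins over the default.
        if name in self.paramMap:
            return self.paramMap[name]
        # Raises KeyError when there is no default either.
        return self.defaultParamMap[name]


store = ParamStore()
store.defaultParamMap["maxIter"] = 100
before = store.getOrDefault("maxIter")   # falls back to the default
store.paramMap["maxIter"] = 10
after = store.getOrDefault("maxIter")    # user-supplied value

try:
    store.getOrDefault("tol")
    missing_raises = False
except KeyError:
    missing_raises = True
```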



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93578861
  
      [Test build #30370 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30370/consoleFull) for   PR 5534 at commit [`ebaccc6`](https://github.com/apache/spark/commit/ebaccc6398c81b13af775769205c0fd19701138d).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28484297
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -52,10 +50,11 @@ class Params(Identifiable):
     
         __metaclass__ = ABCMeta
     
    -    def __init__(self):
    -        super(Params, self).__init__()
    -        #: embedded param map
    -        self.paramMap = {}
    +    #: internal param map for user-supplied values param map
    +    paramMap = {}
    --- End diff --
    
    That's the default behavior. `paramMap` doesn't need a reference to `self`, so this is simpler.
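For readers comparing the two versions of `Params`: in Python, a dict assigned in the class body is a single object shared by all instances unless an instance rebinds the attribute, whereas a dict assigned in `__init__` is created per instance. The sketch below uses two hypothetical classes, `SharedMap` and `OwnMap` (neither is from the PR), to show the difference. Whether the merged code rebinds `paramMap` elsewhere is not shown in this thread.

```python
class SharedMap(object):
    # Class-level attribute: one dict object shared by every instance.
    paramMap = {}


class OwnMap(object):
    def __init__(self):
        # Instance-level attribute: a fresh dict for each instance.
        self.paramMap = {}


a, b = SharedMap(), SharedMap()
a.paramMap["maxIter"] = 10            # mutates the dict stored on the class
shared_leaks = "maxIter" in b.paramMap

c, d = OwnMap(), OwnMap()
c.paramMap["maxIter"] = 10            # mutates only c's own dict
own_leaks = "maxIter" in d.paramMap
```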



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by jkbradley <gi...@git.apache.org>.

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28470158
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -52,10 +50,11 @@ class Params(Identifiable):
     
         __metaclass__ = ABCMeta
     
    -    def __init__(self):
    -        super(Params, self).__init__()
    -        #: embedded param map
    -        self.paramMap = {}
    +    #: internal param map for user-supplied values param map
    +    paramMap = {}
    +
    +    #: internal param map for default values
    +    defaultParamMap = {}
     
         @property
         def params(self):
    --- End diff --
    
    I should have done this previously, but can these be sorted by name (as in Scala)?



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93578876
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30370/



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93644260
  
      [Test build #30396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30396/consoleFull) for   PR 5534 at commit [`d3b519b`](https://github.com/apache/spark/commit/d3b519bfdcbfb54ac3f2b37b4cf500643d7a35de).



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93658707
  
    Merged into master.



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93657549
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30396/



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/5534



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28484298
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -25,23 +25,21 @@
     
     class Param(object):
         """
    -    A param with self-contained documentation and optionally default value.
    +    A param with self-contained documentation.
         """
     
    -    def __init__(self, parent, name, doc, defaultValue=None):
    -        if not isinstance(parent, Identifiable):
    -            raise ValueError("Parent must be identifiable but got type %s." % type(parent).__name__)
    +    def __init__(self, parent, name, doc):
    +        if not isinstance(parent, Params):
    +            raise ValueError("Parent must be a Params but got type %s." % type(parent).__name__)
             self.parent = parent
             self.name = str(name)
             self.doc = str(doc)
    -        self.defaultValue = defaultValue
     
         def __str__(self):
    -        return str(self.parent) + "-" + self.name
    +        return str(self.parent) + "__" + self.name
    --- End diff --
    
    This matches sklearn's approach to specifying parameters. For now it is used only for display.
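As a rough illustration of the display format discussed here, the sketch below joins an owner's string form and a param name with a double underscore, the same separator scikit-learn uses for nested parameter names such as `step__param`. `Owner` and `SketchParam` are hypothetical simplifications (the `isinstance(parent, Params)` check from the diff is omitted), and the uid value is made up for the example.

```python
class Owner(object):
    """Hypothetical parent object; its str() is just a uid."""

    def __init__(self, uid):
        self.uid = uid

    def __str__(self):
        return self.uid


class SketchParam(object):
    """Simplified stand-in for Param, without the parent type check."""

    def __init__(self, parent, name, doc):
        self.parent = parent
        self.name = str(name)
        self.doc = str(doc)

    def __str__(self):
        # Same concatenation as the "+" line in the quoted diff.
        return str(self.parent) + "__" + self.name


owner = Owner("LogisticRegression_4a1b2c3d")  # made-up uid
param = SketchParam(owner, "maxIter", "max number of iterations")
label = str(param)

# The double underscore lets the owner id and param name be split apart again,
# even when the uid itself contains a single underscore.
owner_id, param_name = label.split("__", 1)
```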



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by jkbradley <gi...@git.apache.org>.

Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93589218
  
    @mengxr Looks good other than the minor comments above.



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by jkbradley <gi...@git.apache.org>.

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28470162
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -67,11 +66,112 @@ def params(self):
             return filter(lambda attr: isinstance(attr, Param),
                           [getattr(self, x) for x in dir(self) if x != "params"])
     
    -    def _merge_params(self, params):
    -        paramMap = self.paramMap.copy()
    -        paramMap.update(params)
    +    def _explain(self, param):
    +        """
    +        Explains a single param and returns its name, doc, and optional
    +        default value and user-supplied value in a string.
    +        """
    +        param = self._resolveParam(param)
    +        values = []
    +        if self.isDefined(param):
    +            if param in self.defaultParamMap:
    +                values.append("default: %s" % self.defaultParamMap[param])
    +            if param in self.paramMap:
    +                values.append("current: %s" % self.paramMap[param])
    +        else:
    +            values.append("undefined")
    +        valueStr = "(" + ", ".join(values) + ")"
    +        return "%s: %s %s" % (param.name, param.doc, valueStr)
    +
    +    def explainParams(self):
    +        """
    +        Returns the documentation of all params with their optionally
    +        default values and user-supplied values.
    +        """
    +        return "\n".join([self._explain(param) for param in self.params])
    +
    +    def getParam(self, paramName):
    +        """
    +        Gets a param by its name.
    +        """
    +        param = getattr(self, paramName)
    +        if isinstance(param, Param):
    +            return param
    +        else:
    +            raise ValueError("Cannot find param with name %s." % paramName)
    +
    +    def isSet(self, param):
    +        """
    +        Checks whether a param is explicitly set by user.
    +        """
    +        param = self._resolveParam(param)
    +        return param in self.paramMap
    +
    +    def hasDefault(self, param):
    +        """
    +        Checks whether a param has a default value.
    +        """
    +        param = self._resolveParam(param)
    +        return param in self.defaultParamMap
    +
    +    def isDefined(self, param):
    +        """
    +        Checks whether a param is explicitly set by user or has a default value.
    +        """
    +        return self.isSet(param) or self.hasDefault(param)
    +
    +    def getOrDefault(self, param):
    +        """
    +        Gets the value of a param in the user-supplied param map or its
    +        default value. Raises an error if either is set.
    +        """
    +        if isinstance(param, Param):
    +            if param in self.paramMap:
    +                return self.paramMap[param]
    +            else:
    +                return self.defaultParamMap[param]
    +        elif isinstance(param, str):
    +            return self.getOrDefault(self.getParam(param))
    +        else:
    +            raise KeyError("Cannot recognize %r as a param." % param)
    +
    +    def extractParamMap(self, extraParamMap={}):
    +        """
    +        Extracts the embedded default param values and user-supplied
    +        values, and then merges them with extra values from input into
    +        a flat param map, where the latter values is used if there
    +        exist conflicts, i.e., with ordering: default param values <
    +        user-supplied values < extraParamMap.
    +        :param extraParamMap: extra param values
    +        :return: merged param map
    +        """
    +        paramMap = self.defaultParamMap.copy()
    +        paramMap.update(self.paramMap)
    +        paramMap.update(extraParamMap)
             return paramMap
     
    +    def _shouldOwn(self, param):
    +        """
    +        Validates that the input param belongs to this Params instance.
    +        """
    +        if param.parent is not self:
    +            raise ValueError("Param %r does not belong to %r." % (param, self))
    +
    +    def _resolveParam(self, param):
    +        """
    +        Resolves a param and validates the ownership.
    +        :param param: param name or the param instance, which must
    +                      belongs to this Params instance
    --- End diff --
    
    "belongs" --> "belong"
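The precedence described in the quoted `extractParamMap` docstring (default param values < user-supplied values < `extraParamMap`) follows from applying `dict.update` in that order. A minimal sketch, assuming string keys instead of `Param` instances and a hypothetical free function `extract_param_map`:

```python
def extract_param_map(default_map, user_map, extra_map=None):
    """Merge three maps; later maps override earlier ones on conflict."""
    merged = dict(default_map)       # start from a copy of the defaults
    merged.update(user_map)          # user-supplied values override defaults
    merged.update(extra_map or {})   # extra values override everything else
    return merged


defaults = {"maxIter": 100, "regParam": 0.0}
user = {"maxIter": 10}
extra = {"maxIter": 5, "tol": 1e-6}

merged = extract_param_map(defaults, user, extra)
merged_no_extra = extract_param_map(defaults, user)
```

The sketch uses `None` as the default for the optional argument; the `extraParamMap={}` default in the quoted diff is only read, never mutated, within the lines shown.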



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by jkbradley <gi...@git.apache.org>.

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28470169
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -41,7 +41,7 @@ class Identifiable(object):
         def __init__(self):
             #: A unique id for the object. The default implementation
             #: concatenates the class name, "-", and 8 random hex chars.
    --- End diff --
    
    Update doc: "-" --> "_"



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93657516
  
      [Test build #30396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30396/consoleFull) for   PR 5534 at commit [`d3b519b`](https://github.com/apache/spark/commit/d3b519bfdcbfb54ac3f2b37b4cf500643d7a35de).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.
     * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by mengxr <gi...@git.apache.org>.

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28484300
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -52,10 +50,11 @@ class Params(Identifiable):
     
         __metaclass__ = ABCMeta
     
    -    def __init__(self):
    -        super(Params, self).__init__()
    -        #: embedded param map
    -        self.paramMap = {}
    +    #: internal param map for user-supplied values param map
    +    paramMap = {}
    +
    +    #: internal param map for default values
    +    defaultParamMap = {}
     
         @property
         def params(self):
    --- End diff --
    
    I use `dir` here, which returns a sorted list.
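The ordering claim can be checked with a small example: Python's built-in `dir` returns attribute names sorted alphabetically, regardless of definition order. `Demo` below is a hypothetical class used only for this illustration.

```python
class Demo(object):
    # Deliberately defined out of alphabetical order.
    zeta = 1
    alpha = 2
    mid = 3


# Keep only the public names; dir() also lists dunder attributes.
names = [n for n in dir(Demo()) if not n.startswith("_")]
```

The sort uses ordinary string comparison, so uppercase names come before lowercase ones.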



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5534#issuecomment-93554193
  
      [Test build #30370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30370/consoleFull) for   PR 5534 at commit [`ebaccc6`](https://github.com/apache/spark/commit/ebaccc6398c81b13af775769205c0fd19701138d).



[GitHub] spark pull request: [SPARK-6893][ML] default pipeline parameter ha...

Posted by jkbradley <gi...@git.apache.org>.

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5534#discussion_r28468531
  
    --- Diff: python/pyspark/ml/param/__init__.py ---
    @@ -52,10 +50,11 @@ class Params(Identifiable):
     
         __metaclass__ = ABCMeta
     
    -    def __init__(self):
    -        super(Params, self).__init__()
    -        #: embedded param map
    -        self.paramMap = {}
    +    #: internal param map for user-supplied values param map
    +    paramMap = {}
    --- End diff --
    
    Curious: Why remove the init method and the call to `Identifiable.__init__`?

