You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by viirya <gi...@git.apache.org> on 2018/04/25 10:10:42 UTC

[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/21153

    [SPARK-24058][ML][PySpark] Default Params in ML should be saved separately: Python API

    ## What changes were proposed in this pull request?
    
    See SPARK-23455 for reference. Now default params in ML are saved separately in metadata file in Scala. We must change it for Python for Spark 2.4.0 as well in order to keep them in sync.
    
    ## How was this patch tested?
    
    Added test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-24058

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21153.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21153
    
----
commit 0e9b18a7913c65e8a9b1c2304e2565a5ae9fbfbd
Author: Liang-Chi Hsieh <vi...@...>
Date:   2018-04-25T09:58:52Z

    Default Params in ML should be saved separately in Python API.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #4179 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4179/testReport)** for PR 21153 at commit [`ce84137`](https://github.com/apache/spark/commit/ce841372b76fe3263462b1f51ebfda26e098f8f3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r184632888
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -523,11 +534,29 @@ def getAndSetParams(instance, metadata):
             """
             Extract Params from metadata, and set them in the instance.
             """
    +        # User-supplied param values
             for paramName in metadata['paramMap']:
                 param = instance.getParam(paramName)
                 paramValue = metadata['paramMap'][paramName]
                 instance.set(param, paramValue)
     
    +        # Default param values
    +        majorAndMinorVersions = majorMinorVersion(metadata['sparkVersion'])
    +        assert majorAndMinorVersions is not None, "Error loading metadata: Expected " + \
    +            "Spark version string but found {}".format(metadata['sparkVersion'])
    +
    +        major = majorAndMinorVersions[0]
    +        minor = majorAndMinorVersions[1]
    +        # For metadata file prior to Spark 2.4, there is no default section.
    +        if major > 2 or (major == 2 and minor >= 4):
    +            assert 'defaultParamMap' in metadata, "Error loading metadata: Expected " + \
    +                "`defaultParamMap` section not found"
    +
    +            for paramName in metadata['defaultParamMap']:
    +                param = instance.getParam(paramName)
    +                paramValue = metadata['defaultParamMap'][paramName]
    +                instance._setDefault(**{param.name: paramValue})
    --- End diff --
    
    Oh, right. My bad.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90612/testReport)** for PR 21153 at commit [`dc59375`](https://github.com/apache/spark/commit/dc593754c62d2daf89331ea21d9250af9b9febfd).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2718/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3212/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90393 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90393/testReport)** for PR 21153 at commit [`86fa433`](https://github.com/apache/spark/commit/86fa433ce0098af14283776101f1229b9ea9d184).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Not sure why AppVeyor build failed. How to re-trigger AppVeyor build?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90393/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r188132284
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -396,6 +397,7 @@ def saveMetadata(instance, path, sc, extraMetadata=None, paramMap=None):
             - sparkVersion
             - uid
             - paramMap
    +        - defalutParamMap (since 2.4.0)
    --- End diff --
    
    Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21153


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Thanks @jkbradley @WeichenXu123 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3059/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90394/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89840/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2669/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    OK. Looks like AppVeyor build is ok now.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89841/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3060/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90612/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r185262870
  
    --- Diff: python/pyspark/util.py ---
    @@ -61,6 +62,26 @@ def _get_argspec(f):
         return argspec
     
     
    +def majorMinorVersion(version):
    --- End diff --
    
    Yes please, a few things:
    * throw exception when unable to parse
    * make it a static method of a class pyspark.util.VersionUtils
    * add unit test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90486 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90486/testReport)** for PR 21153 at commit [`ce84137`](https://github.com/apache/spark/commit/ce841372b76fe3263462b1f51ebfda26e098f8f3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90394 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90394/testReport)** for PR 21153 at commit [`b47beab`](https://github.com/apache/spark/commit/b47beab00876d5282ef60eb4e88d7314749a481b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    OK thanks @viirya !
    Merging with master



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #89924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89924/testReport)** for PR 21153 at commit [`526fa4a`](https://github.com/apache/spark/commit/526fa4a96b61f4b5adb6606a92ed440879747a28).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90486/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r185131834
  
    --- Diff: python/pyspark/util.py ---
    @@ -61,6 +62,26 @@ def _get_argspec(f):
         return argspec
     
     
    +def majorMinorVersion(version):
    --- End diff --
    
    Since this affects Spark core, it might be nice to put this in a separate PR.  Also, shall we make this match the Scala API?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Done. Can you try again?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90394 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90394/testReport)** for PR 21153 at commit [`b47beab`](https://github.com/apache/spark/commit/b47beab00876d5282ef60eb4e88d7314749a481b).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r184693332
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc, extraMetadata=None, paramMap=None):
             """
             uid = instance.uid
             cls = instance.__module__ + '.' + instance.__class__.__name__
    -        params = instance.extractParamMap()
    +
    +        # User-supplied param values
    +        params = instance._paramMap
             jsonParams = {}
             if paramMap is not None:
                 jsonParams = paramMap
             else:
                 for p in params:
                     jsonParams[p.name] = params[p]
    +
    +        # Default param values
    +        jsonDefaultParams = {}
    +        for p in instance._defaultParamMap:
    +            jsonDefaultParams[p.name] = instance._defaultParamMap[p]
    --- End diff --
    
    ditto.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r187129517
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -396,6 +397,7 @@ def saveMetadata(instance, path, sc, extraMetadata=None, paramMap=None):
             - sparkVersion
             - uid
             - paramMap
    +        - defalutParamMap (since 2.4.0)
    --- End diff --
    
    typo: default


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by WeichenXu123 <gi...@git.apache.org>.
Github user WeichenXu123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r184620855
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc, extraMetadata=None, paramMap=None):
             """
             uid = instance.uid
             cls = instance.__module__ + '.' + instance.__class__.__name__
    -        params = instance.extractParamMap()
    +
    +        # User-supplied param values
    +        params = instance._paramMap
             jsonParams = {}
             if paramMap is not None:
                 jsonParams = paramMap
             else:
                 for p in params:
                     jsonParams[p.name] = params[p]
    +
    +        # Default param values
    +        jsonDefaultParams = {}
    +        for p in instance._defaultParamMap:
    +            jsonDefaultParams[p.name] = instance._defaultParamMap[p]
    --- End diff --
    
    similar, use `_defaultParamMap.copy()`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #89840 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89840/testReport)** for PR 21153 at commit [`0e9b18a`](https://github.com/apache/spark/commit/0e9b18a7913c65e8a9b1c2304e2565a5ae9fbfbd).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by WeichenXu123 <gi...@git.apache.org>.
Github user WeichenXu123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r184620777
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc, extraMetadata=None, paramMap=None):
             """
             uid = instance.uid
             cls = instance.__module__ + '.' + instance.__class__.__name__
    -        params = instance.extractParamMap()
    +
    +        # User-supplied param values
    +        params = instance._paramMap
             jsonParams = {}
             if paramMap is not None:
                 jsonParams = paramMap
             else:
                 for p in params:
                     jsonParams[p.name] = params[p]
    --- End diff --
    
    I think use `_paramMap.copy()` will be simpler.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by WeichenXu123 <gi...@git.apache.org>.
Github user WeichenXu123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r184626842
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -523,11 +534,29 @@ def getAndSetParams(instance, metadata):
             """
             Extract Params from metadata, and set them in the instance.
             """
    +        # User-supplied param values
             for paramName in metadata['paramMap']:
                 param = instance.getParam(paramName)
                 paramValue = metadata['paramMap'][paramName]
                 instance.set(param, paramValue)
     
    +        # Default param values
    +        majorAndMinorVersions = majorMinorVersion(metadata['sparkVersion'])
    +        assert majorAndMinorVersions is not None, "Error loading metadata: Expected " + \
    +            "Spark version string but found {}".format(metadata['sparkVersion'])
    +
    +        major = majorAndMinorVersions[0]
    +        minor = majorAndMinorVersions[1]
    +        # For metadata file prior to Spark 2.4, there is no default section.
    +        if major > 2 or (major == 2 and minor >= 4):
    +            assert 'defaultParamMap' in metadata, "Error loading metadata: Expected " + \
    +                "`defaultParamMap` section not found"
    +
    +            for paramName in metadata['defaultParamMap']:
    +                param = instance.getParam(paramName)
    +                paramValue = metadata['defaultParamMap'][paramName]
    +                instance._setDefault(**{param.name: paramValue})
    --- End diff --
    
    remove line `param = instance.getParam(paramName)` and change this line to
    `instance._setDefault(**{paramName: paramValue})`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #89841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89841/testReport)** for PR 21153 at commit [`f32350f`](https://github.com/apache/spark/commit/f32350f9ed7c360e62d58441447dbd2fd05dd506).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90393 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90393/testReport)** for PR 21153 at commit [`86fa433`](https://github.com/apache/spark/commit/86fa433ce0098af14283776101f1229b9ea9d184).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #89841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89841/testReport)** for PR 21153 at commit [`f32350f`](https://github.com/apache/spark/commit/f32350f9ed7c360e62d58441447dbd2fd05dd506).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #89840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89840/testReport)** for PR 21153 at commit [`0e9b18a`](https://github.com/apache/spark/commit/0e9b18a7913c65e8a9b1c2304e2565a5ae9fbfbd).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r184693307
  
    --- Diff: python/pyspark/ml/util.py ---
    @@ -417,15 +419,24 @@ def _get_metadata_to_save(instance, sc, extraMetadata=None, paramMap=None):
             """
             uid = instance.uid
             cls = instance.__module__ + '.' + instance.__class__.__name__
    -        params = instance.extractParamMap()
    +
    +        # User-supplied param values
    +        params = instance._paramMap
             jsonParams = {}
             if paramMap is not None:
                 jsonParams = paramMap
             else:
                 for p in params:
                     jsonParams[p.name] = params[p]
    --- End diff --
    
    `_paramMap`'s keys are `Param` not string.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2668/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21153: [SPARK-24058][ML][PySpark] Default Params in ML s...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21153#discussion_r185159691
  
    --- Diff: python/pyspark/util.py ---
    @@ -61,6 +62,26 @@ def _get_argspec(f):
         return argspec
     
     
    +def majorMinorVersion(version):
    --- End diff --
    
    > Also, shall we make this match the Scala API?
    Do you mean to throw exception when unable to parse Spark version?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90612/testReport)** for PR 21153 at commit [`dc59375`](https://github.com/apache/spark/commit/dc593754c62d2daf89331ea21d9250af9b9febfd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3119/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #90486 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90486/testReport)** for PR 21153 at commit [`ce84137`](https://github.com/apache/spark/commit/ce841372b76fe3263462b1f51ebfda26e098f8f3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Would you mind rebasing this off of the upstream master branch?  I'm having trouble running the tests for this PR locally.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #89924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89924/testReport)** for PR 21153 at commit [`526fa4a`](https://github.com/apache/spark/commit/526fa4a96b61f4b5adb6606a92ed440879747a28).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89924/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    **[Test build #4179 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4179/testReport)** for PR 21153 at commit [`ce84137`](https://github.com/apache/spark/commit/ce841372b76fe3263462b1f51ebfda26e098f8f3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21153: [SPARK-24058][ML][PySpark] Default Params in ML should b...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/21153
  
    cc @jkbradley @dbtsai 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org