Posted to reviews@spark.apache.org by mengxr <gi...@git.apache.org> on 2015/11/04 01:21:54 UTC

[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

GitHub user mengxr opened a pull request:

    https://github.com/apache/spark/pull/9454

    [WIP][SPARK-11217][ML] save/load for non-meta estimators and transformers

    This PR implements the default save/load for non-meta estimators and transformers using the JSON serialization of param values. The saved metadata includes:
    
    * class name
    * uid
    * timestamp
    * paramMap
    
    The save/load interface is similar to DataFrames. We use the current active context by default, which should be sufficient for most use cases.
    
    ~~~scala
    instance.save.to("path")
    instance.save.options("overwrite" -> "true").to("path")
    instance.save.context(sqlContext).to("path")
    
    Instance.load.from("path")
    ~~~
    
    The param handling differs from the design doc. We do not save default and user-set params separately, so when an instance is loaded back, all parameters are user-set. This does cause issues, but saving them separately would also cause other issues if we modify the default params.
    
    TODOs:
    
    * [ ] Java test
    * [ ] a follow-up PR to implement default save/load for all non-meta estimators and transformers
    
    cc @jkbradley 
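
For illustration, the metadata fields listed in the description (class name, uid, timestamp, paramMap) could be rendered as a single JSON object roughly as follows. This is a minimal sketch in plain Scala, not the PR's implementation: `MetadataSketch`, `Metadata`, `toJson`, and the JSON key names are hypothetical, and param values are assumed to arrive already JSON-encoded.

```scala
object MetadataSketch {
  // Hypothetical container for the four fields named in the PR description.
  final case class Metadata(
      className: String,
      uid: String,
      timestamp: Long,
      paramMap: Map[String, String]) // param name -> value already encoded as JSON

  // Quotes a string for JSON. Only backslash and double quote are escaped;
  // control characters are not handled in this sketch.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Renders the metadata as one compact JSON object, with params sorted by name.
  def toJson(m: Metadata): String = {
    val params = m.paramMap.toSeq.sortBy(_._1)
      .map { case (name, json) => s"${quote(name)}:$json" }
      .mkString("{", ",", "}")
    s"""{"class":${quote(m.className)},"timestamp":${m.timestamp},"uid":${quote(m.uid)},"paramMap":$params}"""
  }
}
```

Because every param ends up in one `paramMap` object, a loader reading this back cannot tell defaults from user-set values, which matches the behaviour described in the PR text.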

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mengxr/spark SPARK-11217

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9454.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9454
    
----
commit cd1c7eae3246f93b6ee4e044443adfe57fdf1386
Author: Xiangrui Meng <me...@databricks.com>
Date:   2015-11-03T18:56:22Z

    initial implementation

commit df81d61f73c6a854913df638770f0b0409f046a3
Author: Xiangrui Meng <me...@databricks.com>
Date:   2015-11-03T23:41:58Z

    update doc and test

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154471703
  
    Build started sha1 is merged.



[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154125211
  
    **[Test build #45124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45124/consoleFull)** for PR 9454 at commit [`bc8611d`](https://github.com/apache/spark/commit/bc8611d070e58326a31ef6c1d7f95d043839b42e).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `abstract class Writer extends BaseReadWrite`
       * `trait Writable`
       * `abstract class Reader[T] extends BaseReadWrite`
       * `trait Readable[T]`


[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9454#discussion_r43827580
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/params.scala ---
    @@ -592,7 +592,7 @@ trait Params extends Identifiable with Serializable {
       /**
        * Sets a parameter in the embedded param map.
        */
    -  protected final def set[T](param: Param[T], value: T): this.type = {
    +  final def set[T](param: Param[T], value: T): this.type = {
    --- End diff --
    
    it is not feasible to set an arbitrary param outside the instance.


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-155299003
  
    @mengxr  As I implemented save/load for logreg, I found some things I'd like to change: [https://issues.apache.org/jira/browse/SPARK-11618].  I'll write a PR for these changes before logreg.  They should not affect our various PRs too much.


[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153536443
  
    Merged build started.


[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153532873
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9454#discussion_r44074314
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/util/DefaultReadWriteTest.scala ---
    @@ -0,0 +1,103 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ml.util
    +
    +import java.io.{File, IOException}
    +
    +import org.scalatest.Suite
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.ml.param._
    +import org.apache.spark.mllib.util.MLlibTestSparkContext
    +
    +trait DefaultReadWriteTest extends TempDirectory { self: Suite =>
    +
    +  /**
    +   * Checks "overwrite" option and params.
    +   * @param instance ML instance to test saving/loading
    +   * @tparam T ML instance type
    +   */
    +  def testDefaultReadWrite[T <: Params with Writable](instance: T): Unit = {
    +    val uid = instance.uid
    +    val path = new File(tempDir, uid).getPath
    +
    +    instance.write.to(path)
    +    intercept[IOException] {
    +      instance.write.to(path)
    +    }
    +    instance.write.overwrite().to(path)
    +    val loader = instance.getClass.getMethod("read").invoke(null).asInstanceOf[Reader[T]]
    +    val newInstance = loader.from(path)
    +
    +    assert(newInstance.uid === instance.uid)
    +    instance.params.foreach { p =>
    +      if (instance.isDefined(p)) {
    +        (instance.getOrDefault(p), newInstance.getOrDefault(p)) match {
    +          case (Array(values), Array(newValues)) =>
    +            assert(values !== newValues, s"Values do not match on param ${p.name}.")
    --- End diff --
    
    should be ===
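
As background for the comment above: on Scala arrays, plain `==` compares references rather than contents, and ScalaTest's `===` is documented to compare arrays structurally. Where ScalaTest is not available, the element-wise check has to be written out. A minimal sketch, assuming a hypothetical helper `sameValue` that is not part of the PR:

```scala
object ArrayEqualitySketch {
  // Compares two param values, treating arrays element-wise.
  def sameValue(a: Any, b: Any): Boolean = (a, b) match {
    // A type pattern matches arrays of any length. The extractor pattern
    // `Array(x)` would only match arrays with exactly one element.
    case (x: Array[_], y: Array[_]) => x.toSeq == y.toSeq
    // Everything else falls back to ordinary equality.
    case (x, y) => x == y
  }
}
```

For example, `ArrayEqualitySketch.sameValue(Array(6, 7), Array(6, 7))` holds even though the two arrays are distinct objects.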


[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9454#discussion_r44074319
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/util/DefaultReadWriteTest.scala ---
    @@ -0,0 +1,103 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ml.util
    +
    +import java.io.{File, IOException}
    +
    +import org.scalatest.Suite
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.ml.param._
    +import org.apache.spark.mllib.util.MLlibTestSparkContext
    +
    +trait DefaultReadWriteTest extends TempDirectory { self: Suite =>
    +
    +  /**
    +   * Checks "overwrite" option and params.
    +   * @param instance ML instance to test saving/loading
    +   * @tparam T ML instance type
    +   */
    +  def testDefaultReadWrite[T <: Params with Writable](instance: T): Unit = {
    +    val uid = instance.uid
    +    val path = new File(tempDir, uid).getPath
    +
    +    instance.write.to(path)
    +    intercept[IOException] {
    +      instance.write.to(path)
    +    }
    +    instance.write.overwrite().to(path)
    +    val loader = instance.getClass.getMethod("read").invoke(null).asInstanceOf[Reader[T]]
    +    val newInstance = loader.from(path)
    +
    +    assert(newInstance.uid === instance.uid)
    +    instance.params.foreach { p =>
    +      if (instance.isDefined(p)) {
    +        (instance.getOrDefault(p), newInstance.getOrDefault(p)) match {
    +          case (Array(values), Array(newValues)) =>
    +            assert(values !== newValues, s"Values do not match on param ${p.name}.")
    +          case (value, newValue) =>
    +            assert(value === newValue, s"Values do not match on param ${p.name}.")
    +        }
    +      } else {
    +        assert(!newInstance.isDefined(p), s"Param ${p.name} shouldn't be defined.")
    +      }
    +    }
    +  }
    +}
    +
    +class MyParams(override val uid: String) extends Params with Writable {
    +
    +  final val intParamWithDefault: IntParam = new IntParam(this, "intParamWithDefault", "doc")
    +  final val intParam: IntParam = new IntParam(this, "intParam", "doc")
    +  final val floatParam: FloatParam = new FloatParam(this, "floatParam", "doc")
    +  final val doubleParam: DoubleParam = new DoubleParam(this, "doubleParam", "doc")
    +  final val longParam: LongParam = new LongParam(this, "longParam", "doc")
    +  final val stringParam: Param[String] = new Param[String](this, "stringParam", "doc")
    +  final val intArrayParam: IntArrayParam = new IntArrayParam(this, "intArrayParam", "doc")
    +  final val doubleArrayParam: DoubleArrayParam =
    +    new DoubleArrayParam(this, "doubleArrayParam", "doc")
    +  final val stringArrayParam: StringArrayParam =
    +    new StringArrayParam(this, "stringArrayParam", "doc")
    +
    +  setDefault(intParamWithDefault -> 0)
    +  set(intParam -> 1)
    +  set(floatParam -> 2.0f)
    +  set(doubleParam -> 3.0)
    +  set(longParam -> 4L)
    +  set(stringParam -> "5")
    +  set(intArrayParam -> Array(6, 7))
    +  set(doubleArrayParam -> Array(8.0, 9.0))
    +  set(stringArrayParam -> Array("10", "11"))
    +
    +  override def copy(extra: ParamMap): Params = defaultCopy(extra)
    +
    +  override def write: Writer = new DefaultParamsWriter(this)
    +}
    +
    +object MyParams {
    --- End diff --
    
    extend Readable
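
One way to read this suggestion is that the companion object should implement the `Readable[T]` trait, so that `read` is part of a typed interface rather than a method found by reflection. A minimal sketch with simplified stand-ins for `Reader` and `Readable` (the real ones in the diff also handle the SQL context); the loader body here is hypothetical and reads no files:

```scala
object ReadableSketch {
  // Simplified stand-ins for the Reader/Readable types shown in the diff.
  abstract class Reader[T] { def from(path: String): T }
  trait Readable[T] { def read: Reader[T] }

  class MyParams(val uid: String)

  // The companion object extends Readable, so MyParams.read is statically typed.
  object MyParams extends Readable[MyParams] {
    override def read: Reader[MyParams] = new Reader[MyParams] {
      // Hypothetical loader: takes the uid from the last path segment
      // instead of parsing saved metadata.
      override def from(path: String): MyParams =
        new MyParams(path.split('/').last)
    }
  }
}
```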


[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153537621
  
    **[Test build #44980 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44980/consoleFull)** for PR 9454 at commit [`e01e92d`](https://github.com/apache/spark/commit/e01e92d92f3f799356dd6a8cebc60002899090e9).


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9454#discussion_r44159597
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/util/DefaultReadWriteTest.scala ---
    @@ -0,0 +1,103 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ml.util
    +
    +import java.io.{File, IOException}
    +
    +import org.scalatest.Suite
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.ml.param._
    +import org.apache.spark.mllib.util.MLlibTestSparkContext
    +
    +trait DefaultReadWriteTest extends TempDirectory { self: Suite =>
    +
    +  /**
    +   * Checks "overwrite" option and params.
    +   * @param instance ML instance to test saving/loading
    +   * @tparam T ML instance type
    +   */
    +  def testDefaultReadWrite[T <: Params with Writable](instance: T): Unit = {
    +    val uid = instance.uid
    +    val path = new File(tempDir, uid).getPath
    +
    +    instance.write.to(path)
    +    intercept[IOException] {
    +      instance.write.to(path)
    +    }
    +    instance.write.overwrite().to(path)
    +    val loader = instance.getClass.getMethod("read").invoke(null).asInstanceOf[Reader[T]]
    +    val newInstance = loader.from(path)
    +
    +    assert(newInstance.uid === instance.uid)
    +    instance.params.foreach { p =>
    +      if (instance.isDefined(p)) {
    +        (instance.getOrDefault(p), newInstance.getOrDefault(p)) match {
    +          case (Array(values), Array(newValues)) =>
    +            assert(values !== newValues, s"Values do not match on param ${p.name}.")
    --- End diff --
    
    Sorry! This was for debugging. I forgot to change it back before push.


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154567630
  
    Spark merge script not happy.  Maybe a conflict was just introduced?


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154474146
  
    **[Test build #45225 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45225/consoleFull)** for PR 9454 at commit [`f862b6a`](https://github.com/apache/spark/commit/f862b6a997faeaa3df770779549193eae20f38d7).


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9454


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154476517
  
    **[Test build #45224 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45224/consoleFull)** for PR 9454 at commit [`a410538`](https://github.com/apache/spark/commit/a41053860ee0540bfff97d8e7e13100c2110c8df).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `abstract class Writer extends BaseReadWrite`
       * `trait Writable`
       * `abstract class Reader[T] extends BaseReadWrite`
       * `trait Readable[T]`


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154561645
  
    LGTM


[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153532562
  
    **[Test build #44978 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44978/consoleFull)** for PR 9454 at commit [`df81d61`](https://github.com/apache/spark/commit/df81d61f73c6a854913df638770f0b0409f046a3).


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154457322
  
    Build started sha1 is merged.



[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9454#discussion_r44074302
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
    @@ -0,0 +1,212 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ml.util
    +
    +import java.{util => ju}
    +import java.io.IOException
    +
    +import scala.annotation.varargs
    +import scala.collection.mutable
    +import scala.collection.JavaConverters._
    +
    +import org.apache.hadoop.fs.{FileSystem, Path}
    +import org.json4s._
    +import org.json4s.JsonDSL._
    +import org.json4s.jackson.JsonMethods._
    +
    +import org.apache.spark.{Logging, SparkContext}
    +import org.apache.spark.annotation.{Experimental, Since}
    +import org.apache.spark.ml.param.{ParamPair, Params}
    +import org.apache.spark.sql.SQLContext
    +import org.apache.spark.util.Utils
    +
    +/**
    + * Trait for [[Writer]] and [[Reader]].
    + */
    +private[util] sealed trait BaseReadWrite {
    +  private var optionSQLContext: Option[SQLContext] = None
    +
    +  /**
    +   * Sets the SQL context to use for saving/loading.
    +   */
    +  @Since("1.6.0")
    +  def context(sqlContext: SQLContext): this.type = {
    +    optionSQLContext = Option(sqlContext)
    +    this
    +  }
    +
    +  /**
    +   * Returns the user-specified SQL context or the default.
    +   */
    +  protected final def sqlContext: SQLContext = optionSQLContext.getOrElse {
    +    SQLContext.getOrCreate(SparkContext.getOrCreate())
    +  }
    +}
    +
    +/**
    + * Abstract class for utility classes that can save ML instances.
    + */
    +@Experimental
    +@Since("1.6.0")
    +abstract class Writer extends BaseReadWrite {
    +
    +  protected var shouldOverwrite: Boolean = false
    +
    +  /**
    +   * Saves the ML instance to the input path.
    +   */
    +  @Since("1.6.0")
    +  @throws[IOException]("If the input path already exists but overwrite is not enabled.")
    +  def to(path: String): Unit
    +
    +  /**
    +   * Saves the ML instances to the input path, the same as [[to()]].
    +   */
    +  @Since("1.6.0")
    +  @throws[IOException]("If the input path already exists but overwrite is not enabled.")
    +  def save(path: String): Unit = to(path)
    +
    +  /**
    +   * Overwrites if the output path already exists.
    +   */
    +  def overwrite(): this.type = {
    +    shouldOverwrite = true
    +    this
    +  }
    +}
    +
    +/**
    + * Trait for classes that provide [[Writer]].
    + */
    +@Since("1.6.0")
    +trait Writable {
    +
    +  /**
    +   * Returns a [[Writer]] instance for this ML instance.
    +   */
    +  @Since("1.6.0")
    +  def write: Writer
    +}
    +
    +/**
    + * Abstract class for utility classes that can load ML instances.
    + * @tparam T ML instance type
    + */
    +@Experimental
    +@Since("1.6.0")
    +abstract class Reader[T] extends BaseReadWrite {
    +
    +  /**
    +   * Loads the ML component from the input path.
    +   */
    +  @Since("1.6.0")
    +  def from(path: String): T
    +
    +  /**
    +   * Loads the ML component from the input path, the same as [[from()]].
    +   */
    +  def load(path: String): T = from(path)
    +}
    +
    +/**
    + * Trait for objects that provide [[Reader]].
    + * @tparam T ML instance type
    + */
    +@Experimental
    +@Since("1.6.0")
    +trait Readable[T] {
    +  
    +  /**
    +   * Returns a [[Reader]] instance for this class.
    +   */
    +  @Since("1.6.0")
    +  def read: Reader[T]
    +}
    +
    +/**
    + * Default [[Writer]] implementation for non-meta transformers and estimators.
    --- End diff --
    
    "non-meta transformers and estimators" --> "transformers and estimators which contain basic (json4s-serializable) Params and no data.  This will not handle more complex Params or types with data (e.g., models with coefficients)."
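
To make the "basic Params and no data" distinction concrete, here is a minimal sketch in plain Scala, without json4s or Spark. It assumes the four metadata fields listed in the PR description (class name, uid, timestamp, paramMap). The object and method names (`MetadataSketch`, `renderValue`, `metadataJson`) are hypothetical and are not part of the patch. Only primitive-like param values are rendered; anything else is rejected, which mirrors the limitation described in the comment above.

```scala
// Hypothetical sketch; none of these names come from the Spark patch.
object MetadataSketch {

  // Renders only "basic" values. Anything else (vectors, nested models,
  // coefficient arrays) is rejected instead of being silently dropped.
  // Note: NaN and infinite doubles are not valid JSON and are not handled here.
  def renderValue(v: Any): String = v match {
    case s: String  => "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
    case b: Boolean => b.toString
    case i: Int     => i.toString
    case l: Long    => l.toString
    case d: Double  => d.toString
    case other =>
      throw new UnsupportedOperationException(s"Not a basic param value: $other")
  }

  // Builds the metadata document: class name, uid, timestamp, and paramMap.
  def metadataJson(
      className: String,
      uid: String,
      timestamp: Long,
      params: Seq[(String, Any)]): String = {
    val paramJson = params
      .map { case (name, value) => renderValue(name) + ":" + renderValue(value) }
      .mkString("{", ",", "}")
    s"""{"class":${renderValue(className)},"uid":${renderValue(uid)},"timestamp":$timestamp,"paramMap":$paramJson}"""
  }
}
```

A param holding, say, an array of coefficients would hit the `UnsupportedOperationException` branch here, which is the case the suggested doc wording warns about.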


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154461001
  
    **[Test build #45224 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45224/consoleFull)** for PR 9454 at commit [`a410538`](https://github.com/apache/spark/commit/a41053860ee0540bfff97d8e7e13100c2110c8df).




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154567294
  
    I'll merge this with branch-1.6 and master




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153536434
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154567840
  
    Nevermind, second time's the charm




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154125219
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45124/
    Test FAILed.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154516174
  
    Build triggered. sha1 is merged.




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154161932
  
    Reviewing now




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154203756
  
    That's it; just minor comments




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154122990
  
     Merged build triggered.




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153543927
  
    **[Test build #44980 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44980/consoleFull)** for PR 9454 at commit [`e01e92d`](https://github.com/apache/spark/commit/e01e92d92f3f799356dd6a8cebc60002899090e9).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `abstract class Saver extends BaseSaveLoad`
       * `trait Saveable`
       * `abstract class Loader[T] extends BaseSaveLoad`
       * `trait Loadable[T]`




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153532869
  
    **[Test build #44978 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44978/consoleFull)** for PR 9454 at commit [`df81d61`](https://github.com/apache/spark/commit/df81d61f73c6a854913df638770f0b0409f046a3).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `abstract class Saver extends BaseSaveLoad`
       * `trait Saveable`
       * `abstract class Loader[T] extends BaseSaveLoad`
       * `trait Loadable[T]`




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153532876
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44978/
    Test FAILed.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154517432
  
    **[Test build #45244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45244/consoleFull)** for PR 9454 at commit [`7952bd4`](https://github.com/apache/spark/commit/7952bd40dd98304aadda25df349627ad938e0cb5).




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154482534
  
    
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45225/





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154180741
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45140/
    Test FAILed.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154482532
  
    Build finished. 824 tests run, 0 skipped, 2 failed.





[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154530695
  
    **[Test build #45244 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45244/consoleFull)** for PR 9454 at commit [`7952bd4`](https://github.com/apache/spark/commit/7952bd40dd98304aadda25df349627ad938e0cb5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `abstract class Writer extends BaseReadWrite`
       * `trait Writable`
       * `abstract class Reader[T] extends BaseReadWrite`
       * `trait Readable[T]`




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154471676
  
    Build triggered. sha1 is merged.




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154177277
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154470661
  
    @jkbradley I removed `from` and `to` because `from` is a Python keyword. Instead, I added `load(path)` and `save(path)` as shortcuts.
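
For readers following the API change, the resulting call shape (`write.save(path)`, `write.overwrite().save(path)`, `read.load(path)`) can be sketched without Spark. Everything below is a hypothetical stand-in: `SketchWriter`, `SketchReader`, and `Greeting` are illustrative names, and the local-file handling via `java.nio.file` is an assumption made only so the example runs on its own. The overwrite check follows the `IOException` contract documented on `Writer` in the diff.

```scala
import java.io.IOException
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical stand-in for the Writer in the diff; no Spark dependency.
abstract class SketchWriter {
  protected var shouldOverwrite: Boolean = false

  // Fails with IOException if the path exists and overwrite() was not called.
  @throws[IOException]
  def save(path: String): Unit

  // Returns this.type so calls can be chained: write.overwrite().save(path)
  def overwrite(): this.type = {
    shouldOverwrite = true
    this
  }
}

// Hypothetical stand-in for the Reader in the diff.
abstract class SketchReader[T] {
  def load(path: String): T
}

// Toy instance whose entire state is one string.
final case class Greeting(text: String) {
  def write: SketchWriter = new SketchWriter {
    override def save(path: String): Unit = {
      val p = Paths.get(path)
      if (Files.exists(p) && !shouldOverwrite) {
        throw new IOException(s"Path $path already exists; call overwrite() to replace it.")
      }
      Files.write(p, text.getBytes(StandardCharsets.UTF_8))
    }
  }
}

object Greeting {
  def read: SketchReader[Greeting] = new SketchReader[Greeting] {
    override def load(path: String): Greeting =
      Greeting(new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8))
  }
}
```

Because `load` and `save` are ordinary identifiers in Python as well as Scala, a Python wrapper could expose the same method names, which `from` would not allow.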




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153544082
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44980/
    Test PASSed.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154531315
  
    Build finished. No test results found.





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153532178
  
    Merged build started.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154516233
  
    Build started sha1 is merged.





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9454#discussion_r44154791
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
    @@ -0,0 +1,212 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ml.util
    +
    +import java.{util => ju}
    +import java.io.IOException
    +
    +import scala.annotation.varargs
    +import scala.collection.mutable
    +import scala.collection.JavaConverters._
    +
    +import org.apache.hadoop.fs.{FileSystem, Path}
    +import org.json4s._
    +import org.json4s.JsonDSL._
    +import org.json4s.jackson.JsonMethods._
    +
    +import org.apache.spark.{Logging, SparkContext}
    +import org.apache.spark.annotation.{Experimental, Since}
    +import org.apache.spark.ml.param.{ParamPair, Params}
    +import org.apache.spark.sql.SQLContext
    +import org.apache.spark.util.Utils
    +
    +/**
    + * Trait for [[Writer]] and [[Reader]].
    + */
    +private[util] sealed trait BaseReadWrite {
    +  private var optionSQLContext: Option[SQLContext] = None
    +
    +  /**
    +   * Sets the SQL context to use for saving/loading.
    +   */
    +  @Since("1.6.0")
    +  def context(sqlContext: SQLContext): this.type = {
    --- End diff --
    
    This function could be overloaded for `context(SparkContext)`, `context(JavaSparkContext)`, or in the future `context(MLContext)`. So I call it `context` to avoid having multiple method names.
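
A small sketch of the overloading argument, assuming placeholder types in place of the real contexts: `FakeSparkContext`, `FakeSQLContext`, `BaseReadWriteSketch`, and the `sqlContextOption` accessor are hypothetical and exist only to keep the example free of Spark. It shows one method name, `context`, accepting more than one context type while still returning `this.type` for chaining.

```scala
// Hypothetical placeholders standing in for SparkContext and SQLContext.
final class FakeSparkContext(val appName: String)
final class FakeSQLContext(val sparkContext: FakeSparkContext)

trait BaseReadWriteSketch {
  private var optionSQLContext: Option[FakeSQLContext] = None

  // Sets the SQL context to use for saving/loading.
  def context(sqlContext: FakeSQLContext): this.type = {
    optionSQLContext = Option(sqlContext)
    this
  }

  // Same method name, different argument type: wraps the lower-level
  // context and delegates to the overload above.
  def context(sc: FakeSparkContext): this.type =
    context(new FakeSQLContext(sc))

  // Accessor added only so the sketch can be inspected.
  def sqlContextOption: Option[FakeSQLContext] = optionSQLContext
}
```

With `setSqlContext`-style naming, each new context type would need its own method name; with overloading, callers keep writing `.context(...)` regardless of which context they hold.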




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154177305
  
    Merged build started.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154457267
  
    Build triggered. sha1 is merged.




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154124601
  
    **[Test build #45124 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45124/consoleFull)** for PR 9454 at commit [`bc8611d`](https://github.com/apache/spark/commit/bc8611d070e58326a31ef6c1d7f95d043839b42e).




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154123036
  
    Merged build started.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154476688
  
    Build finished. 958 tests run, 0 skipped, 0 failed.





[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154482463
  
    **[Test build #45225 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45225/consoleFull)** for PR 9454 at commit [`f862b6a`](https://github.com/apache/spark/commit/f862b6a997faeaa3df770779549193eae20f38d7).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `abstract class Writer extends BaseReadWrite`
       * `trait Writable`
       * `abstract class Reader[T] extends BaseReadWrite`
       * `trait Readable[T]`




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9454#discussion_r44074311
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
    @@ -0,0 +1,212 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.ml.util
    +
    +import java.{util => ju}
    +import java.io.IOException
    +
    +import scala.annotation.varargs
    +import scala.collection.mutable
    +import scala.collection.JavaConverters._
    +
    +import org.apache.hadoop.fs.{FileSystem, Path}
    +import org.json4s._
    +import org.json4s.JsonDSL._
    +import org.json4s.jackson.JsonMethods._
    +
    +import org.apache.spark.{Logging, SparkContext}
    +import org.apache.spark.annotation.{Experimental, Since}
    +import org.apache.spark.ml.param.{ParamPair, Params}
    +import org.apache.spark.sql.SQLContext
    +import org.apache.spark.util.Utils
    +
    +/**
    + * Trait for [[Writer]] and [[Reader]].
    + */
    +private[util] sealed trait BaseReadWrite {
    +  private var optionSQLContext: Option[SQLContext] = None
    +
    +  /**
    +   * Sets the SQL context to use for saving/loading.
    +   */
    +  @Since("1.6.0")
    +  def context(sqlContext: SQLContext): this.type = {
    --- End diff --
    
    rename to setContext or setSqlContext?
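
For illustration only, a minimal sketch of what the suggested rename might look like. It assumes a stand-in `FakeSQLContext` in place of Spark's `SQLContext`, and the names `BaseReadWriteSketch`, `WriterSketch`, and `SetContextDemo` are hypothetical; none of this is code from the PR. The setter keeps the `this.type` return type shown in the diff, so calls still chain on the concrete subclass.

```scala
// Hypothetical stand-in for org.apache.spark.sql.SQLContext, so the sketch
// compiles without Spark on the classpath.
final case class FakeSQLContext(name: String)

// Mirrors the shape of BaseReadWrite from the diff, with the setter renamed.
trait BaseReadWriteSketch {
  private var optionSQLContext: Option[FakeSQLContext] = None

  // Returns this.type so that calls chain on the concrete subclass.
  def setContext(sqlContext: FakeSQLContext): this.type = {
    optionSQLContext = Option(sqlContext)
    this
  }

  // Falls back to a default when no context was set explicitly
  // (an assumption standing in for "the current active context").
  protected final def sqlContext: FakeSQLContext =
    optionSQLContext.getOrElse(FakeSQLContext("default"))
}

// Hypothetical writer that only reports where it would write.
class WriterSketch extends BaseReadWriteSketch {
  def to(path: String): String = s"${sqlContext.name}:$path"
}

object SetContextDemo {
  def main(args: Array[String]): Unit = {
    // Explicit context, then chained call on the same writer.
    println(new WriterSketch().setContext(FakeSQLContext("mine")).to("path"))
    // No context set, so the default is used.
    println(new WriterSketch().to("path"))
  }
}
```

With a `set` prefix the method reads like the other param setters in `spark.ml`, at the cost of a slightly longer call chain than `save.context(sqlContext).to("path")`.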




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153544081
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154531333
  
    
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45244/





[GitHub] spark pull request: [SPARK-11217][ML] save/load for non-meta estim...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154476691
  
    
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45224/





[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-153532152
  
    Merged build triggered.




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154180739
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [WIP][SPARK-11217][ML] save/load for non-meta ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9454#issuecomment-154125215
  
    Merged build finished. Test FAILed.

