Posted to reviews@spark.apache.org by supremekai <gi...@git.apache.org> on 2016/07/29 01:34:07 UTC

[GitHub] spark pull request #14394: [SPARK-16786] [Python] [WIP] LDA topic distributi...

GitHub user supremekai opened a pull request:

    https://github.com/apache/spark/pull/14394

    [SPARK-16786] [Python] [WIP] LDA topic distributions API Call for python

    ## What changes were proposed in this pull request?
    
    Implemented a Python call to topicDistributions for pyspark.mllib.clustering.LDAModel.
    
    ## How was this patch tested?
    Ran ./dev/run-tests; all tests pass.
    Verified manually.
    Parameter types and return types follow the existing API calls, so behaviour is consistent with the existing API.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/supremekai/spark pyspark-topic-distributions

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14394.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14394
    
----
commit 00d93ccba2cfbc298f820d3d4391c4ad11211b4f
Author: Jordan <jo...@dotlovesdata.com>
Date:   2016-07-28T22:58:09Z

    Added pyspark API call to MLlib LDAModel topicDistributions function

commit 5f36d785a689d21cb4392f56a30afdc8188bbc2a
Author: Jordan <jo...@dotlovesdata.com>
Date:   2016-07-29T01:12:37Z

    Fixed imports and styling

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14394: [SPARK-16786] [Python] [WIP] LDA topic distributions API...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14394
  
    Can one of the admins verify this patch?




[GitHub] spark pull request #14394: [SPARK-16786] [Python] [WIP] LDA topic distributi...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14394#discussion_r82463086
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala ---
    @@ -777,6 +777,10 @@ class DistributedLDAModel private[clustering] (
         JavaPairRDD.fromRDD(topicDistributions.asInstanceOf[RDD[(java.lang.Long, Vector)]])
       }
     
    +  override def topicDistributions(documents: RDD[(Long, Vector)]): RDD[(Long, Vector)] = {
    --- End diff --
    
    Is this what we want here? Defining it on the parent when half of the children don't implement it might be confusing to some users.
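
The concern in the comment above can be sketched in plain Python. This is only an illustration under stated assumptions: every class and function name below is hypothetical and is not taken from Spark or from this patch. Design A declares the method on the parent, so a child that does not support it still appears to have it and only fails when called. Design B declares it only on the child that supports it, so the gap is visible before the call.

```python
# Hypothetical names throughout; this is NOT Spark's class hierarchy.

class ParentWithMethod:
    """Design A: the parent declares the method for every child."""

    def topic_distributions(self, documents):
        raise NotImplementedError(
            f"{type(self).__name__} does not support topic_distributions"
        )


class LocalA(ParentWithMethod):
    def topic_distributions(self, documents):
        # Placeholder result: a fixed two-topic distribution per document.
        return [(doc_id, [0.5, 0.5]) for doc_id, _ in documents]


class DistributedA(ParentWithMethod):
    pass  # inherits the stub, so the failure only shows up at call time


class ParentWithoutMethod:
    """Design B: the parent does not mention the method at all."""


class LocalB(ParentWithoutMethod):
    def topic_distributions(self, documents):
        # Same placeholder result as LocalA.
        return [(doc_id, [0.5, 0.5]) for doc_id, _ in documents]


class DistributedB(ParentWithoutMethod):
    pass  # the method is simply absent


def supports_topic_distributions(model):
    """Return True if the model exposes a callable topic_distributions."""
    return callable(getattr(model, "topic_distributions", None))
```

Under design A, `supports_topic_distributions(DistributedA())` is true even though the call raises `NotImplementedError`. Under design B, the same check is false for `DistributedB()`, so a user finds out before calling.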




[GitHub] spark issue #14394: [SPARK-16786] [Python] [WIP] LDA topic distributions API...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/14394
  
    @supremekai Thanks for the PR! I'm sorry about the inactivity on this. However, now that this functionality has been added to the DataFrame-based API (in pyspark.ml), we will not be adding it to the RDD-based API. Could you please close this issue?




[GitHub] spark pull request #14394: [SPARK-16786] [Python] [WIP] LDA topic distributi...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14394

