You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by izendejas <gi...@git.apache.org> on 2014/04/22 03:13:49 UTC

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

GitHub user izendejas opened a pull request:

    https://github.com/apache/spark/pull/473

    Minor optimizations. Use safer take, tail methods.

    Per an email thread I initiated and after feedback and clearing my ICLA, I'm requesting some minor changes be pulled to use the more readable and safer take/tail calls over slice. Also, found an unnecessary Seq instantiation to compute the min of two values. Got rid of reverse sortBys which used negation as they weren't as readable.
    
    Did not create a JIRA as these are very minor, but can do so.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/izendejas/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/473.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #473
    
----
commit 81065aed9987c1b08cd5784b7a6153e26f3f7402
Author: Ignacio Zendejas <ig...@gmail.com>
Date:   2014-04-10T20:43:07Z

    Some minor optimizations to hopefully initiate my first PR and
    familiarize myself with the process, mostly.
    
    * got rid of SeqLike.reverse calls when sorting by descending order
    * replaced slice(1, length) with safer and more readable tail calls
    * used foldLeft when aggregating num of docs in naive bayes code

commit 36888799b95282a5d4bbb5c4d745b2118ca96fa7
Author: Ignacio Zendejas <ig...@gmail.com>
Date:   2014-04-22T00:11:37Z

    reverted changes per feedback prior to quick pr

commit 76833d86e72e4e554272b6c376331800f9e40324
Author: Ignacio Zendejas <ig...@gmail.com>
Date:   2014-04-22T00:46:14Z

    Merge branch 'master' into sort_desc_optimizations
    
    Conflicts:
    	core/src/main/scala/org/apache/spark/ui/jobs/JobProgressPage.scala
    	core/src/main/scala/org/apache/spark/ui/jobs/PoolPage.scala
    	mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-41132176
  
    Merged build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-50947599
  
    PS if you need a JIRA to attach this to, it's arguably similar to https://issues.apache.org/jira/browse/SPARK-2799 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-41132118
  
    Jenkins, test this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-41136740
  
    Merged build finished. All automated tests passed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-40994587
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-50695794
  
    This is quite out of date, so I think we should close this until there is further activity.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/473#discussion_r11889136
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
    @@ -1120,7 +1120,7 @@ object DecisionTree extends Serializable with Logging {
         sc.textFile(dir).map { line =>
           val parts = line.trim().split(",")
           val label = parts(0).toDouble
    -      val features = Vectors.dense(parts.slice(1,parts.length).map(_.toDouble))
    +      val features = Vectors.dense(parts.tail.map(_.toDouble))
    --- End diff --
    
    by two arrays, i meant one in parts, and the other for the tail, and the map. We should be able to skip 2 of them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-50733349
  
    I'd be happy to resuscitate this along with a few similar Scala operation simplifications.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by izendejas <gi...@git.apache.org>.
Github user izendejas commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-50830204
  
    My apologies. I had to get clearance from my current employer and then I
    totally forgot about it. I'll merge later today. It's quick stuff that was
    meant to just get me familiar with the committing process.
    
    
    On Thu, Jul 31, 2014 at 1:48 AM, Sean Owen <no...@github.com> wrote:
    
    > I'd be happy to resuscitate this along with a few similar Scala operation
    > simplifications.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/473#issuecomment-50733349>.
    >


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-41132163
  
     Merged build triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/473


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by izendejas <gi...@git.apache.org>.
Github user izendejas commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-47125703
  
    Will do later today. Thanks.
    
    
    On Wed, Jun 25, 2014 at 1:16 AM, Reynold Xin <no...@github.com>
    wrote:
    
    > @izendejas <https://github.com/izendejas> do you mind updating the pull
    > request to address my comment? Everything else looks good.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/473#issuecomment-47072778>.
    >


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/473#discussion_r11889119
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
    @@ -1120,7 +1120,7 @@ object DecisionTree extends Serializable with Logging {
         sc.textFile(dir).map { line =>
           val parts = line.trim().split(",")
           val label = parts(0).toDouble
    -      val features = Vectors.dense(parts.slice(1,parts.length).map(_.toDouble))
    +      val features = Vectors.dense(parts.tail.map(_.toDouble))
    --- End diff --
    
    btw this is probably a part we can speed up and reduce gc by not creating two arrays ... 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-41007764
  
    Jenkins, test this please. Thanks for the clean-up! Looks good to me, pending tests...



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-41136744
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14363/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Minor optimizations. Use safer take, tail meth...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/473#issuecomment-47072778
  
    @izendejas do you mind updating the pull request to address my comment? Everything else looks good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---