Posted to reviews@spark.apache.org by erikerlandson <gi...@git.apache.org> on 2014/08/12 21:17:06 UTC

[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

GitHub user erikerlandson opened a pull request:

    https://github.com/apache/spark/pull/1909

    [SPARK-2991] Implement RDD lazy transforms for scanLeft and scan

    Discussion of implementations:
    http://erikerlandson.github.io/blog/2014/08/09/implementing-an-rdd-scanleft-transform-with-cascade-rdds/
    http://erikerlandson.github.io/blog/2014/08/12/implementing-parallel-prefix-scan-as-a-spark-rdd-transform/
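
As background on the proposed semantics (the linked blog posts have the full design): scanLeft is a cumulative fold that also emits every intermediate result, and scan is its parallel prefix-scan variant built on an associative operator. The snippet below is only a minimal, local-collection illustration of that expected behavior; the actual RDD method signatures are the ones defined by this PR, and the rdd.scanLeft call mentioned in the comment is hypothetical shorthand.

    // Minimal illustration of scanLeft semantics, using a plain Scala collection.
    // The PR proposes the analogous operation as a *lazy* RDD transform; the
    // rdd.scanLeft call mentioned below is hypothetical shorthand, not the PR's API.
    object ScanLeftSemantics extends App {
      val xs = Seq(1, 2, 3, 4, 5)

      // scanLeft emits the seed followed by every intermediate fold result.
      val prefixSums = xs.scanLeft(0)(_ + _)
      println(prefixSums.mkString(", "))   // 0, 1, 3, 6, 10, 15

      // A hypothetical rdd.scanLeft(0)(_ + _) would be expected to yield the same
      // sequence element for element, without eagerly materializing the input RDD.
    }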


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/erikerlandson/spark spark-2991-pr

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1909.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1909
    
----
commit 4555c5b080bc4adeb198c57aeca6e86ca862f545
Author: Erik Erlandson <ee...@redhat.com>
Date:   2014-08-02T01:43:28Z

    [SPARK-2991] Implement RDD lazy transforms for scanLeft and scan

----




[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/1909#issuecomment-51965072
  
    Erik, you've been doing some great work on making non-lazy transforms lazy!  I haven't had time to thoroughly review your recent PRs, but can you do some checks and probably add some tests to make sure that all of your recent efforts work correctly not only for synchronous actions on RDDs (collect, count, et al.) but also for the async actions in AsyncRDDActions.scala?  It looked to me like at least the RangePartitioner work, while better than what is in Spark now, still had some trouble with async actions: essentially, the production of the rangeBounds doesn't get captured within the FutureAction, so the action isn't cancellable, and so on.
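
For readers following along, here is a hedged sketch of the concern (the job shape and sizes below are made up for illustration; collectAsync and FutureAction.cancel are existing Spark APIs). sortByKey builds a RangePartitioner, and producing its rangeBounds involves sampling work of its own; if that work is not captured inside the FutureAction returned by the async action, cancel() cannot reach it.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._   // implicits for async/ordered RDD functions on Spark 1.x

    object AsyncCancelCheck {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("async-check").setMaster("local[*]"))

        // sortByKey uses a RangePartitioner, whose rangeBounds are produced by
        // sampling the input -- work that can run outside the async action below.
        val sorted = sc.parallelize(1 to 1000000, 8).map(x => (x % 1000, x)).sortByKey()

        val future = sorted.collectAsync()   // FutureAction[Seq[(Int, Int)]]
        future.cancel()                      // ideally this cancels *all* work for the action,
                                             // including the rangeBounds computation
        sc.stop()
      }
    }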




[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1909#issuecomment-54694477
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

Posted by petro-rudenko <gi...@git.apache.org>.
Github user petro-rudenko commented on the pull request:

    https://github.com/apache/spark/pull/1909#issuecomment-90063723
  
    +1 for this. A useful feature for calculating a distributed cumulative sum.
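
To make the use case concrete: without a built-in scanLeft, a distributed cumulative sum is typically done with a two-pass workaround like the sketch below (illustrative only, not the code in this PR). Note that the workaround launches an eager job just to collect the per-partition totals, which is exactly the kind of non-laziness a lazy scanLeft transform is meant to avoid.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.RDD

    object CumulativeSum {
      // Two-pass cumulative sum: per-partition totals -> per-partition offsets
      // -> running sum within each partition, shifted by its offset.
      def cumulativeSum(data: RDD[Long]): RDD[Long] = {
        val partitionTotals = data
          .mapPartitionsWithIndex { (i, it) => Iterator((i, it.sum)) }
          .collect()                       // eager job, unlike a lazy scanLeft transform
          .sortBy(_._1)
          .map(_._2)

        // offsets(i) = sum of all elements in partitions 0 until i
        val offsets = partitionTotals.scanLeft(0L)(_ + _)

        data.mapPartitionsWithIndex { (i, it) =>
          it.scanLeft(offsets(i))(_ + _).drop(1)   // drop the seed, keep the running sums
        }
      }

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("cumsum").setMaster("local[*]"))
        val xs = sc.parallelize(1L to 10L, 3)
        println(cumulativeSum(xs).collect().mkString(", "))   // 1, 3, 6, 10, ..., 55
        sc.stop()
      }
    }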




[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1909#issuecomment-51963308
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

Posted by erikerlandson <gi...@git.apache.org>.
Github user erikerlandson commented on the pull request:

    https://github.com/apache/spark/pull/1909#issuecomment-51965848
  
    Good point, I will look into those.





[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/1909#issuecomment-51971006
  
    @erikerlandson  perhaps also create an umbrella ticket and make all the related tickets sub-tasks of the umbrella one? This way they are a lot easier to track. Cheers.





[GitHub] spark pull request: [SPARK-2991] Implement RDD lazy transforms for...

Posted by erikerlandson <gi...@git.apache.org>.
Github user erikerlandson commented on the pull request:

    https://github.com/apache/spark/pull/1909#issuecomment-51977051
  
    @rxin I created an umbrella ticket:
    https://issues.apache.org/jira/browse/SPARK-2992



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org