Posted to reviews@spark.apache.org by manygrams <gi...@git.apache.org> on 2015/10/27 04:53:40 UTC

[GitHub] spark pull request: [SPARK-11335][STREAMING] update kafka direct p...

GitHub user manygrams opened a pull request:

    https://github.com/apache/spark/pull/9289

    [SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD

    @tdas @koeninger 
    
    This updates the Kafka Streaming integration docs with a working method to access the offsets of a `KafkaRDD` through Python.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manygrams/spark update_kafka_direct_python_docs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9289.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9289
    
----
commit acb202a1a2e28b28df16c6e5269c65c2da9dddc3
Author: Nick Evans <me...@nicolasevans.org>
Date:   2015-10-27T03:51:23Z

    update kafka direct python docs on how to get the offset ranges for a KafkaRDD

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11335][STREAMING] update kafka direct p...

Posted by tdas <gi...@git.apache.org>.
GitHub user tdas commented on the pull request:

    https://github.com/apache/spark/pull/9289#issuecomment-155915593
  
    LGTM. Merging this to master and 1.6. Thanks! :)





[GitHub] spark pull request: [SPARK-11335][STREAMING] update kafka direct p...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9289




[GitHub] spark pull request: [SPARK-11335][STREAMING] update kafka direct p...

Posted by manygrams <gi...@git.apache.org>.
GitHub user manygrams commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9289#discussion_r43131969
  
    --- Diff: docs/streaming-kafka-integration.md ---
    @@ -181,7 +181,20 @@ Next, we discuss how to use this approach in your streaming application.
     		);
     	</div>
     	<div data-lang="python" markdown="1">
    -		Not supported yet
    +		offsetRanges = []
    +
    +		def storeOffsetRanges(rdd):
    +		    del offsetRanges[:]
    +		    offsetRanges.extend(rdd.offsetRanges())
    --- End diff --
    
    I tried that and couldn't get it working. It seems that the assignment creates a new object instead of updating the existing one.
    
    However, if I add `global offsetRanges` before the assignment, it works fine. I'll push that change.
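
The behaviour described in this comment can be reproduced without Spark. The following is a minimal sketch, not the code from the pull request: `FakeRDD` and the three `store_*` function names are hypothetical stand-ins, and only the `offsetRanges()` method name is taken from the diff. It assumes the functions run at module level in the driver process.

```python
class FakeRDD:
    """Hypothetical stand-in for a KafkaRDD; only offsetRanges() is modeled."""

    def __init__(self, ranges):
        self._ranges = list(ranges)

    def offsetRanges(self):
        return list(self._ranges)


offsetRanges = []


def store_in_place(rdd):
    # Approach from the original diff: mutate the existing module-level list.
    del offsetRanges[:]
    offsetRanges.extend(rdd.offsetRanges())
    return rdd


def store_local_only(rdd):
    # Plain assignment binds a new *local* name; the module-level list is untouched.
    offsetRanges = rdd.offsetRanges()
    return rdd


def store_with_global(rdd):
    # Declaring the name global makes the assignment rebind the module-level name.
    global offsetRanges
    offsetRanges = rdd.offsetRanges()
    return rdd


# Tuples here are illustrative (topic, partition, fromOffset, untilOffset) values.
store_local_only(FakeRDD([("topic", 0, 0, 10)]))
after_local = list(offsetRanges)

store_with_global(FakeRDD([("topic", 0, 10, 20)]))
after_global = list(offsetRanges)

store_in_place(FakeRDD([("topic", 0, 20, 30)]))
after_in_place = list(offsetRanges)
```

In this sketch only `store_local_only` leaves the module-level list unchanged, which matches the observation that plain assignment did not work until `global offsetRanges` was added.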




[GitHub] spark pull request: [SPARK-11335][STREAMING] update kafka direct p...

Posted by manygrams <gi...@git.apache.org>.
GitHub user manygrams commented on the pull request:

    https://github.com/apache/spark/pull/9289#issuecomment-155560594
  
    Hey @tdas, any new thoughts on this? Sorry, I forgot to notify you that I had made your suggested changes.




[GitHub] spark pull request: [SPARK-11335][STREAMING] update kafka direct p...

Posted by tdas <gi...@git.apache.org>.
GitHub user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9289#discussion_r43094153
  
    --- Diff: docs/streaming-kafka-integration.md ---
    @@ -181,7 +181,20 @@ Next, we discuss how to use this approach in your streaming application.
     		);
     	</div>
     	<div data-lang="python" markdown="1">
    -		Not supported yet
    +		offsetRanges = []
    +
    +		def storeOffsetRanges(rdd):
    +		    del offsetRanges[:]
    +		    offsetRanges.extend(rdd.offsetRanges())
    --- End diff --
    
    Can't we simply use `offsetRanges = rdd.offsetRanges()`?




[GitHub] spark pull request: [SPARK-11335][STREAMING] update kafka direct p...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9289#issuecomment-151365988
  
    Can one of the admins verify this patch?

