
Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/04/06 19:57:25 UTC

[jira] [Assigned] (SPARK-14421) Kinesis deaggregation with PySpark

     [ https://issues.apache.org/jira/browse/SPARK-14421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-14421:
------------------------------------

    Assignee: Apache Spark

> Kinesis deaggregation with PySpark
> ----------------------------------
>
>                 Key: SPARK-14421
>                 URL: https://issues.apache.org/jira/browse/SPARK-14421
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.6.1
>         Environment: PySpark w/ Kinesis word count example
>            Reporter: Brian ONeill
>            Assignee: Apache Spark
>
> I'm creating this issue as a precaution...
> We have preliminary evidence that KPL de-aggregation for Kinesis streams may not work in Spark 1.6.1. With the PySpark Kinesis Word Count example and masterUrl = local[16], we receive no records when the data is produced by the KPL with aggregation turned on.
> At the same time, I noticed this thread:
> https://forums.aws.amazon.com/message.jspa?messageID=707122
> Following the instructions here:
> http://brianoneill.blogspot.com/2016/03/pyspark-on-amazon-emr-w-kinesis.html
> The example works only intermittently with aggregation enabled. When aggregation is disabled, it appears to always work. I'm going to dig a bit deeper, but thought you might have some pointers.
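
One way to narrow down a report like this is to check whether the payloads that do reach the application are still wrapped in the KPL aggregation envelope. The following is a minimal sketch, not part of the original report and not Spark's or the KCL's own code. It assumes the published KPL aggregated-record layout (a 4-byte magic prefix, a protobuf body, and a trailing 16-byte MD5 of that body). The name `looks_kpl_aggregated` is hypothetical, and the check only helps if raw bytes are available to inspect.

```python
import hashlib

# Assumed envelope, per the published KPL aggregation format:
#   4-byte magic prefix | protobuf-encoded body | 16-byte MD5 of the body
KPL_MAGIC = b"\xf3\x89\x9a\xc2"
MD5_LEN = 16


def looks_kpl_aggregated(payload: bytes) -> bool:
    """Return True if payload carries the KPL aggregated-record envelope.

    Hypothetical diagnostic helper: it validates only the magic prefix and
    the trailing checksum, and does not parse the protobuf body.
    """
    # The payload must hold the prefix, a non-empty body, and the checksum.
    if len(payload) <= len(KPL_MAGIC) + MD5_LEN:
        return False
    if not payload.startswith(KPL_MAGIC):
        return False
    body = payload[len(KPL_MAGIC):-MD5_LEN]
    checksum = payload[-MD5_LEN:]
    return hashlib.md5(body).digest() == checksum
```

If this returns True for payloads seen inside the streaming job, de-aggregation did not happen upstream. If no payloads arrive at all, as described in the report, the helper cannot say anything and the receiver logs are the better place to look.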


