Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2018/08/08 02:15:00 UTC

[jira] [Resolved] (SPARK-25052) Is there any possibility that spark structured streaming generate duplicates in the output?

     [ https://issues.apache.org/jira/browse/SPARK-25052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-25052.
----------------------------------
    Resolution: Invalid

Questions are better directed to the mailing list; see https://spark.apache.org/community.html. Please file an issue once it is clear that this is actually a bug.

> Is there any possibility that spark structured streaming generate duplicates in the output?
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25052
>                 URL: https://issues.apache.org/jira/browse/SPARK-25052
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: bharath kumar avusherla
>            Priority: Minor
>
> We recently observed that Spark Structured Streaming generated duplicates in the output when reading from a Kafka topic and writing the output to S3 (with checkpointing in S3 as well). We ran into this issue twice, but it is not reproducible. Has anyone faced this kind of issue before? Could it be caused by S3 eventual consistency?


