Posted to user@spark.apache.org by karthikjay <as...@gmail.com> on 2018/05/10 00:05:02 UTC

[Structured-Streaming][Beginner] Out of order messages with Spark kafka readstream from a specific partition

On the producer side, I make sure data for a specific user lands on the same
partition. On the consumer side, I use a regular Spark Kafka readStream and
read the data. I also use a console writeStream to print out the Spark
Kafka DataFrame. What I observe is that the data for a specific user (even
though it is in the same partition) arrives out of order in the console.

I also verified the data ordering by running a simple Kafka consumer in Java,
and there the data seems to be in order. What am I missing here?

Thanks,
JK



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: [Structured-Streaming][Beginner] Out of order messages with Spark kafka readstream from a specific partition

Posted by Cody Koeninger <co...@koeninger.org>.
As long as you aren't doing any spark operations that involve a
shuffle, the order you see in spark should be the same as the order in
the partition.
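
One way to check this is to print the `partition` and `offset` columns that
the Kafka source exposes alongside the value, and then confirm that offsets
only increase within each partition. Rows from different partitions may
interleave freely; only the order inside a single partition is guaranteed.
Below is a minimal sketch in plain Python of that check, assuming the printed
rows have been copied out as (partition, offset) pairs. The function name and
the sample data are hypothetical, not taken from the original poster's job.

```python
def find_out_of_order(rows):
    """Return (partition, highest_offset_so_far, offset) for every row whose
    offset is not greater than the highest offset already seen for the same
    partition. An empty list means per-partition order was preserved."""
    highest_seen = {}  # partition -> highest offset observed so far
    violations = []
    for partition, offset in rows:
        prev = highest_seen.get(partition)
        if prev is not None and offset <= prev:
            violations.append((partition, prev, offset))
        else:
            highest_seen[partition] = offset
    return violations


# Hypothetical rows, listed in the order they appeared in the console output.
# Interleaving between partitions 0 and 1 is fine.
in_order = [(0, 10), (1, 7), (0, 11), (1, 8), (0, 12)]

# Here offset 11 of partition 0 shows up after offset 12.
reordered = [(0, 10), (0, 12), (1, 7), (0, 11), (1, 8)]

print(find_out_of_order(in_order))
print(find_out_of_order(reordered))
```

If the check reports violations, the next thing to look for is an operation
in the query that introduces a shuffle (for example a repartition, a
grouping, or a join) between the Kafka source and the console sink.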

Can you link to a minimal code example that reproduces the issue?

On Wed, May 9, 2018 at 7:05 PM, karthikjay <as...@gmail.com> wrote:
> On the producer side, I make sure data for a specific user lands on the same
> partition. On the consumer side, I use a regular Spark Kafka readStream and
> read the data. I also use a console writeStream to print out the Spark
> Kafka DataFrame. What I observe is that the data for a specific user (even
> though it is in the same partition) arrives out of order in the console.
>
> I also verified the data ordering by running a simple Kafka consumer in Java,
> and there the data seems to be in order. What am I missing here?
>
> Thanks,
> JK
