Posted to user@spark.apache.org by kant kodali <ka...@gmail.com> on 2018/05/15 10:05:42 UTC

What to consider when implementing a custom streaming sink?

Hi All,

I am trying to implement a custom sink, and I have a few questions, mainly
about output modes.

1) How does Spark let the sink know that a new row is an update of an
existing row? Does it compare all column values of the new row against an
existing row for an equality match, or does it compute some sort of hash?
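To make (1) concrete, here is a tiny sketch of what I imagine my sink would
have to do if, in Update mode, it only receives the rows that changed and is
left to do the upsert itself. This is plain Python, not the actual Spark Sink
API, and all names (InMemoryUpsertSink, add_batch, key_column) are made up
for illustration:

```python
# Hypothetical sketch (not the real Spark Sink API): a sink that upserts
# incoming rows by a key column, rather than comparing or hashing whole rows.

class InMemoryUpsertSink:
    def __init__(self, key_column):
        self.key_column = key_column  # column treated as the identity of a row
        self.store = {}               # key -> latest row seen for that key

    def add_batch(self, rows):
        """Called once per micro-batch; `rows` is a list of dicts."""
        for row in rows:
            # Upsert by key: the key alone decides whether this row is
            # an insert or an update of an earlier row.
            self.store[row[self.key_column]] = row

# usage sketch: second batch updates the row for key "a"
sink = InMemoryUpsertSink(key_column="word")
sink.add_batch([{"word": "a", "count": 1}, {"word": "b", "count": 2}])
sink.add_batch([{"word": "a", "count": 5}])
```

Is this key-based upsert essentially what a custom sink is expected to do,
or does Spark give the sink more information per row?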

2) What else do I need to consider when writing a custom sink?

Thanks!