You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by Leonardo Miguel <le...@arquivei.com.br> on 2019/01/22 15:52:54 UTC

PubsubIO: watermark on DirectRunerr

Hi!

I'm using Scio 0.7.0 to run the following code:

sc
>   .withName("read from pubsub")
>   .pubsubSubscription[PubsubMessage]("projects/arquivei-curso-dataflow/subscriptions/curso-dataflow")
>   .withName("log message received and decode").map { message =>
>     val messageContent = new String(message.getPayload, "UTF-8")
>     Ex03.logger.info(s"message received: " + messageContent.toString)
>     messageContent
>   }
>   .withName("window por segundos").withFixedWindows(
>     Duration.standardSeconds(windowDuration),
>     options = WindowOptions(
>       allowedLateness = Duration.ZERO,
>       trigger = AfterWatermark.pastEndOfWindow(),
>       accumulationMode = AccumulationMode.DISCARDING_FIRED_PANES
>     )
>   )
>   .withName("split words").flatMap { line =>
>     Ex03.logger.info(s"splitting words for $line")
>     line.split(" ")
>   }
>   .withName("group words").groupBy { word =>
>     Ex03.logger.info(s"group word=$word")
>     word
>   }
>   .withName("count words").map { result =>
>     val word = result._1
>     val count = result._2.size
>     Ex03.logger.info(s"$word showed up $count times in past 10 seconds")
>   }
>
>
It's a simple example for a streaming wordcount, with fixed windows of 10
seconds.
I would like to count words for each 10 seconds window.

When using Dataflow runner, code works as expected.
But when using Direct Runner, code never reaches "count words" step no
matter how much time pipeline is running. I get all logs before the result
ones ("word showed up X times...").
It seems watermark never reaches the 10 seconds window on Direct Runner,
and panes are never fired.
I tried debugging event times and it is okay. Pubsub sets correctly time
for each event I'm sending.

Is there a workaround for this?

-- 
[]s

Leonardo Alves Miguel
Data Engineer
(16) 3509-5515 | www.arquivei.com.br
<https://arquivei.com.br/?utm_campaign=assinatura-email&utm_content=assinatura>
[image: Arquivei.com.br – Inteligência em Notas Fiscais]
<https://arquivei.com.br/?utm_campaign=assinatura-email&utm_content=assinatura>
[image: Google seleciona Arquivei para imersão e mentoria no Vale do
Silício]
<https://arquivei.com.br/blog/google-seleciona-arquivei/?utm_campaign=assinatura-email-launchpad&utm_content=assinatura-launchpad>
<https://www.facebook.com/arquivei>
<https://www.linkedin.com/company/arquivei>
<https://www.youtube.com/watch?v=KJFrh8h4Zds&yt%3Acc=on>