Posted to user@beam.apache.org by Gaurav Nakum <ga...@oracle.com> on 2020/09/10 19:38:54 UTC

Output from Window not getting materialized

Hi everyone!

We are developing a new IO connector using the SDF API, and testing it with the following simple counting pipeline:
 
p.apply(MyIO.read()
        .withStream(inputStream)
        .withStreamPartitions(Arrays.asList(0))
        .withConsumerConfig(config)
    ) // gets a PCollection<KV<String, String>>

 .apply(Values.<String>create()) // PCollection<String>

 .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
     .withAllowedLateness(Duration.standardDays(1))
     .accumulatingFiredPanes())

 .apply(Count.<String>perElement())

 // write PCollection<KV<String, Long>> to stream
 .apply(MyIO.write()
        .withStream(outputStream)
        .withConsumerConfig(config));
 
 

Without the window transform we can read from the stream and write to it; however, I don’t see any output after the Window transform. Could you please help pin down the issue?

Thank you,

Gaurav


Re: Output from Window not getting materialized

Posted by Praveen K Viswanathan <ha...@gmail.com>.
Hello Luke,

Thanks for the reference. The plan is to go with the "MonotonicallyIncreasing"
watermark estimator, but we are not sure how to implement it with our
source, which is Oracle Streaming Service (OSS). For other sources like
Kafka I can see a "TimestampPolicy" through which the watermark in the IO
talks natively to Kafka. I looked for something similar in OSS but did not
find anything in its SDK. I am planning to check with the OSS development
team on this, and it would help if you could share what exactly we would
need from the source to implement a watermark. If you have any other code
base that implements a monotonically increasing watermark, that would also
be helpful.
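
For context, this is roughly the skeleton we are starting from (only the
watermark-related methods are shown; OssPartition, the omitted restriction
methods, and the rest of the DoFn are placeholders for whatever the OSS SDK
actually gives us):

import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.splittabledofn.WatermarkEstimators;
import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
import org.apache.beam.sdk.values.KV;
import org.joda.time.Instant;

public class ReadFromOssDoFn extends DoFn<OssPartition, KV<String, String>> {

  // ... @GetInitialRestriction / @NewTracker / @ProcessElement omitted ...

  @GetInitialWatermarkEstimatorState
  public Instant getInitialWatermarkEstimatorState() {
    // Start at the minimum timestamp until real record timestamps are observed.
    return BoundedWindow.TIMESTAMP_MIN_VALUE;
  }

  @NewWatermarkEstimator
  public WatermarkEstimators.MonotonicallyIncreasing newWatermarkEstimator(
      @WatermarkEstimatorState Instant state) {
    // Reports the largest timestamp seen so far among the records we output.
    return new WatermarkEstimators.MonotonicallyIncreasing(state);
  }
}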

Regards,
Praveen

On Mon, Sep 14, 2020 at 9:10 AM Luke Cwik <lc...@google.com> wrote:

> Is the watermark advancing[1, 2] for the SDF such that the windows can
> close allowing for the Count transform to produce output?
>
> 1: https://www.youtube.com/watch?v=TWxSLmkWPm4
> 2: https://beam.apache.org/documentation/programming-guide/#windowing
>
> On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <ga...@oracle.com>
> wrote:
>
>> Hi everyone!
>>
>> We are developing a new IO connector using the SDF API, and testing it
>> with the following simple counting pipeline:
>>
>>
>>
>> p.apply(MyIO.read()
>>
>>         .withStream(inputStream)
>>
>>         .withStreamPartitions(Arrays.asList(0))
>>
>>         .withConsumerConfig(config)
>>
>>     ) // gets a PCollection<KV<String, String>>
>>
>>
>>
>>
>>
>> .apply(Values.<String>*create*()) // PCollection<String>
>>
>>
>>
>> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>>
>>     .withAllowedLateness(Duration.standardDays(1))
>>
>>     .accumulatingFiredPanes())
>>
>>
>>
>> .apply(Count.<String>perElement())
>>
>>
>>
>>
>>
>> // write PCollection<KV<String, Long>> to stream
>>
>> .apply(MyIO.write()
>>
>>         .withStream(outputStream)
>>
>>         .withConsumerConfig(config));
>>
>>
>>
>>
>>
>> Without the window transform, we can read from the stream and write to
>> it, however, I don’t see output after the Window transform. Could you
>> please help pin down the issue?
>>
>> Thank you,
>>
>> Gaurav
>>
>

-- 
Thanks,
Praveen K Viswanathan

Re: Output from Window not getting materialized

Posted by Praveen K Viswanathan <ha...@gmail.com>.
Thanks Luke. I will look into @RequiresTimeSortedInput for the sorting
requirement. In parallel, I will start on the monotonically increasing
watermark estimator and come back if I have any questions. Have a great day.
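
From a first read of the docs, my understanding is that it sits on a stateful
DoFn roughly like this (just a sketch; the element types and names are
placeholders):

import org.apache.beam.sdk.coders.InstantCoder;
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.state.ValueState;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;
import org.joda.time.Instant;

public class InOrderPerKeyDoFn extends DoFn<KV<String, String>, KV<String, String>> {

  // Having per-key state makes this a stateful DoFn, which is where
  // @RequiresTimeSortedInput applies.
  @StateId("lastTimestamp")
  private final StateSpec<ValueState<Instant>> lastTimestampSpec =
      StateSpecs.value(InstantCoder.of());

  @RequiresTimeSortedInput
  @ProcessElement
  public void processElement(
      @Element KV<String, String> element,
      @Timestamp Instant timestamp,
      @StateId("lastTimestamp") ValueState<Instant> lastTimestamp,
      OutputReceiver<KV<String, String>> receiver) {
    // The runner delivers elements for each key sorted by event-time timestamp,
    // so out-of-order records from the source are handled before we see them.
    lastTimestamp.write(timestamp);
    receiver.output(element);
  }
}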

On Mon, Sep 21, 2020 at 8:48 PM Luke Cwik <lc...@google.com> wrote:

>
>
> On Mon, Sep 21, 2020 at 5:02 PM Praveen K Viswanathan <
> harish.praveen@gmail.com> wrote:
>
>> Hi Luke, Thanks for the detailed explanation. This gives more insight to
>> new people like me trying to grok the whole concept.
>>
>> *1) What timestamp your going to output your records at*
>>
>> ** use upstream input elements timestamp: guidance use the default
>> implementation and to get the upstream watermark by default*
>> ** use data from within the record being output or external system state
>> via an API call: use a watermark estimator*
>>
>> In the above section, I do not think our source has a watermark concept
>> built-in to derive and use it in SDF so we will have to go with the second
>> option. If suppose we could extract a timestamp from the source message
>> then do we have to setWatermark with that extracted timestamp before each
>> output in @ProcessElement? And can we use the Manual Watermark Estimator
>> itself for this approach?
>>
>
> You want to use context.outputWithTimestamp when parsing your own
> timestamps and emitting records.
>
> Using the manual one works but also take a look at timestamp observing
> works since it will be told the timestamp of each element being produced.
> Using the timestamp observing ones (monotonically increasing or your own)
> allows you to decouple the watermark estimator logic from the SDF
> implementation.
>
>
>>
>>
>> *2) How you want to compute the watermark estimate (if at all)*
>> ** the choice here depends on how the elements timestamps progress, are
>> they in exactly sorted order, almost sorted order, completely unsorted,
>> ...?*
>>
>> Our elements are "almost sorted order" because of which we want to hold
>> off processing message_01 with timestamp 11:00:10 AM until we process
>> message-02 with timestamp 11:00:08 AM. How do we enable this ordering while
>> processing the messages?
>>
>> Based on your suggestion, I tried WallTime Estimator and it worked for
>> one of our many scenarios. I am planning to test it with a bunch of other
>> window types and use that till we get a solid hold on doing it in the above
>> mentioned way that can handle the unsorted messages.
>>
>
> If you're extracting the timestamps out of your data, it would likely be
> best to use the monotonically increasing timestamp estimator or write one
> that computes one using some statistical method appropriate to your source.
> If you think you have written one that is generally useful, feel free to
> contribute it to Beam.
>
> You'll want to look into @RequiresTimeSortedInput[1]. This allows you to
> produce the messages in any order and requires the runner to make sure they
> are sorted before passing to a downstream stateful DoFn.
>
> 1:
> https://lists.apache.org/thread.html/9cdac2a363e18be58fa1f14c838c61e8406ae3407e4e2d05e423234c%40%3Cdev.beam.apache.org%3E
>
>
>>
>> Regards,
>> Praveen
>>
>> On Fri, Sep 18, 2020 at 10:06 AM Luke Cwik <lc...@google.com> wrote:
>>
>>> To answer your specific question, you should create and return the
>>> WallTime estimator. You shouldn't need to interact with it from within
>>> your @ProcessElement call since your elements are using the current time
>>> for their timestamp.
>>>
>>> On Fri, Sep 18, 2020 at 10:04 AM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> Kafka is a complex example because it is adapting code from before
>>>> there was an SDF implementation (namely the TimestampPolicy and the
>>>> TimestampFn/TimestampFnS/WatermarkFn/WatermarkFn2 functions).
>>>>
>>>> There are three types of watermark estimators that are in the Beam Java
>>>> SDK today:
>>>> Manual: Can be invoked from within your @ProcessElement method within
>>>> your SDF allowing you precise control over what the watermark is.
>>>> WallTime: Doesn't need to be interacted with, will report the current
>>>> time as the watermark time. Once it is instantiated and returned via the
>>>> @NewWatermarkEstimator method you don't need to do anything with it. This
>>>> is functionally equivalent to calling setWatermark(Instant.now()) right
>>>> before returning from the @ProcessElement method in the SplittableDoFn on a
>>>> Manual watermark.
>>>> TimestampObserving: Is invoked using the output timestamp for every
>>>> element that is output. This is functionally equivalent to calling
>>>> setWatermark after each output within your @ProcessElement method in the
>>>> SplittableDoFn. The MonotonicallyIncreasing implementation for
>>>> the TimestampObserving estimator ensures that the largest timestamp seen so
>>>> far will be reported for the watermark.
>>>>
>>>> The default is to not set any watermark estimate.
>>>>
>>>> For all watermark estimators you're allowed to set the watermark
>>>> estimate to anything as the runner will recompute the output watermark as:
>>>> new output watermark = max(previous output watermark, min(upstream
>>>> watermark, watermark estimates))
>>>> This effectively means that the watermark will never go backwards from
>>>> the runners point of view but that does mean that setting the watermark
>>>> estimate below the previous output watermark (which isn't observable) will
>>>> not do anything beyond holding the watermark at the previous output
>>>> watermark.
>>>>
>>>> Depending on the windowing strategy and allowed lateness, any records
>>>> that are output with a timestamp that is too early can be considered
>>>> droppably late, otherwise they will be late/ontime/early.
>>>>
>>>> So as an author for an SDF transform, you need to figure out:
>>>> 1) What timestamp your going to output your records at
>>>> * use upstream input elements timestamp: guidance use the default
>>>> implementation and to get the upstream watermark by default
>>>> * use data from within the record being output or external system state
>>>> via an API call: use a watermark estimator
>>>> 2) How you want to compute the watermark estimate (if at all)
>>>> * the choice here depends on how the elements timestamps progress, are
>>>> they in exactly sorted order, almost sorted order, completely unsorted, ...?
>>>>
>>>> For both of these it is upto you to choose how much flexibility in
>>>> these decisions you want to give to your users and that should guide what
>>>> you expose within the API (like how KafkaIO exposes a TimestampPolicy) or
>>>> how many other sources don't expose anything.
>>>>
>>>>
>>>> On Thu, Sep 17, 2020 at 8:43 AM Praveen K Viswanathan <
>>>> harish.praveen@gmail.com> wrote:
>>>>
>>>>> Hi Luke,
>>>>>
>>>>> I am also looking at the `WatermarkEstimators.manual` option, in
>>>>> parallel. Now we are getting data past our Fixed Window but the aggregation
>>>>> is not as expected.  The doc says setWatermark will "set timestamp
>>>>> before or at the timestamps of all future elements produced by the
>>>>> associated DoFn". If I output with a timestamp as below then could
>>>>> you please clarify on how we should set the watermark for this manual
>>>>> watermark estimator?
>>>>>
>>>>> receiver.outputWithTimestamp(ossRecord, Instant.now());
>>>>>
>>>>> Thanks,
>>>>> Praveen
>>>>>
>>>>> On Mon, Sep 14, 2020 at 9:10 AM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> Is the watermark advancing[1, 2] for the SDF such that the windows
>>>>>> can close allowing for the Count transform to produce output?
>>>>>>
>>>>>> 1: https://www.youtube.com/watch?v=TWxSLmkWPm4
>>>>>> 2: https://beam.apache.org/documentation/programming-guide/#windowing
>>>>>>
>>>>>> On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <
>>>>>> gaurav.nakum@oracle.com> wrote:
>>>>>>
>>>>>>> Hi everyone!
>>>>>>>
>>>>>>> We are developing a new IO connector using the SDF API, and testing
>>>>>>> it with the following simple counting pipeline:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> p.apply(MyIO.read()
>>>>>>>
>>>>>>>         .withStream(inputStream)
>>>>>>>
>>>>>>>         .withStreamPartitions(Arrays.asList(0))
>>>>>>>
>>>>>>>         .withConsumerConfig(config)
>>>>>>>
>>>>>>>     ) // gets a PCollection<KV<String, String>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> .apply(Values.<String>*create*()) // PCollection<String>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>>>>>>>
>>>>>>>     .withAllowedLateness(Duration.standardDays(1))
>>>>>>>
>>>>>>>     .accumulatingFiredPanes())
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> .apply(Count.<String>perElement())
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> // write PCollection<KV<String, Long>> to stream
>>>>>>>
>>>>>>> .apply(MyIO.write()
>>>>>>>
>>>>>>>         .withStream(outputStream)
>>>>>>>
>>>>>>>         .withConsumerConfig(config));
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Without the window transform, we can read from the stream and write
>>>>>>> to it, however, I don’t see output after the Window transform. Could you
>>>>>>> please help pin down the issue?
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Gaurav
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Praveen K Viswanathan
>>>>>
>>>>
>>
>> --
>> Thanks,
>> Praveen K Viswanathan
>>
>

-- 
Thanks,
Praveen K Viswanathan

Re: Output from Window not getting materialized

Posted by Luke Cwik <lc...@google.com>.
On Mon, Sep 21, 2020 at 5:02 PM Praveen K Viswanathan <
harish.praveen@gmail.com> wrote:

> Hi Luke, Thanks for the detailed explanation. This gives more insight to
> new people like me trying to grok the whole concept.
>
> *1) What timestamp your going to output your records at*
>
> ** use upstream input elements timestamp: guidance use the default
> implementation and to get the upstream watermark by default*
> ** use data from within the record being output or external system state
> via an API call: use a watermark estimator*
>
> In the above section, I do not think our source has a watermark concept
> built-in to derive and use it in SDF so we will have to go with the second
> option. If suppose we could extract a timestamp from the source message
> then do we have to setWatermark with that extracted timestamp before each
> output in @ProcessElement? And can we use the Manual Watermark Estimator
> itself for this approach?
>

You want to use context.outputWithTimestamp when parsing your own
timestamps and emitting records.

Using the manual one works, but also take a look at how the timestamp-observing
ones work, since they are told the timestamp of each element being produced.
Using a timestamp-observing one (monotonically increasing or your own)
allows you to decouple the watermark estimator logic from the SDF
implementation.
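
Roughly, the processing side of that looks like this (a sketch; MyRecord,
readNextBatch and parseTimestamp are placeholders for your connector's own
code):

@ProcessElement
public ProcessContinuation processElement(
    RestrictionTracker<OffsetRange, Long> tracker,
    OutputReceiver<KV<String, String>> receiver) {
  for (MyRecord record : readNextBatch()) {
    if (!tracker.tryClaim(record.getOffset())) {
      return ProcessContinuation.stop();
    }
    // Timestamp parsed out of the record itself.
    Instant eventTime = parseTimestamp(record);
    // A timestamp-observing estimator returned from @NewWatermarkEstimator is
    // told about eventTime automatically; nothing else to call here.
    receiver.outputWithTimestamp(KV.of(record.getKey(), record.getValue()), eventTime);
  }
  return ProcessContinuation.resume();
}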


>
>
> *2) How you want to compute the watermark estimate (if at all)*
> ** the choice here depends on how the elements timestamps progress, are
> they in exactly sorted order, almost sorted order, completely unsorted,
> ...?*
>
> Our elements are "almost sorted order" because of which we want to hold
> off processing message_01 with timestamp 11:00:10 AM until we process
> message-02 with timestamp 11:00:08 AM. How do we enable this ordering while
> processing the messages?
>
> Based on your suggestion, I tried WallTime Estimator and it worked for one
> of our many scenarios. I am planning to test it with a bunch of other
> window types and use that till we get a solid hold on doing it in the above
> mentioned way that can handle the unsorted messages.
>

If you're extracting the timestamps out of your data, it would likely be
best to use the monotonically increasing watermark estimator, or to write one
that computes the estimate using some statistical method appropriate to your
source. If you think you have written one that is generally useful, feel free
to contribute it to Beam.
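
For example, a custom timestamp-observing estimator could look roughly like
this (a sketch only; this class does not exist in Beam, and the fixed
allowed-skew heuristic is just one possible policy):

import org.apache.beam.sdk.transforms.splittabledofn.TimestampObservingWatermarkEstimator;
import org.joda.time.Duration;
import org.joda.time.Instant;

/** Reports the largest observed timestamp minus a fixed allowed skew. */
public class SkewToleratingWatermarkEstimator
    implements TimestampObservingWatermarkEstimator<Instant> {

  private final Duration allowedSkew;
  private Instant maxObserved;

  public SkewToleratingWatermarkEstimator(Instant state, Duration allowedSkew) {
    this.maxObserved = state;
    this.allowedSkew = allowedSkew;
  }

  @Override
  public void observeTimestamp(Instant timestamp) {
    // Called by the SDK with the timestamp of every element that is output.
    if (timestamp.isAfter(maxObserved)) {
      maxObserved = timestamp;
    }
  }

  @Override
  public Instant currentWatermark() {
    // Hold the estimate back by the allowed skew to tolerate almost-sorted data.
    return maxObserved.minus(allowedSkew);
  }

  @Override
  public Instant getState() {
    // Persisted at checkpoints and handed back via @WatermarkEstimatorState.
    return maxObserved;
  }
}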

You'll want to look into @RequiresTimeSortedInput[1]. This allows you to
produce the messages in any order and requires the runner to make sure they
are sorted before passing to a downstream stateful DoFn.

1:
https://lists.apache.org/thread.html/9cdac2a363e18be58fa1f14c838c61e8406ae3407e4e2d05e423234c%40%3Cdev.beam.apache.org%3E


>
> Regards,
> Praveen
>
> On Fri, Sep 18, 2020 at 10:06 AM Luke Cwik <lc...@google.com> wrote:
>
>> To answer your specific question, you should create and return the
>> WallTime estimator. You shouldn't need to interact with it from within
>> your @ProcessElement call since your elements are using the current time
>> for their timestamp.
>>
>> On Fri, Sep 18, 2020 at 10:04 AM Luke Cwik <lc...@google.com> wrote:
>>
>>> Kafka is a complex example because it is adapting code from before there
>>> was an SDF implementation (namely the TimestampPolicy and the
>>> TimestampFn/TimestampFnS/WatermarkFn/WatermarkFn2 functions).
>>>
>>> There are three types of watermark estimators that are in the Beam Java
>>> SDK today:
>>> Manual: Can be invoked from within your @ProcessElement method within
>>> your SDF allowing you precise control over what the watermark is.
>>> WallTime: Doesn't need to be interacted with, will report the current
>>> time as the watermark time. Once it is instantiated and returned via the
>>> @NewWatermarkEstimator method you don't need to do anything with it. This
>>> is functionally equivalent to calling setWatermark(Instant.now()) right
>>> before returning from the @ProcessElement method in the SplittableDoFn on a
>>> Manual watermark.
>>> TimestampObserving: Is invoked using the output timestamp for every
>>> element that is output. This is functionally equivalent to calling
>>> setWatermark after each output within your @ProcessElement method in the
>>> SplittableDoFn. The MonotonicallyIncreasing implementation for
>>> the TimestampObserving estimator ensures that the largest timestamp seen so
>>> far will be reported for the watermark.
>>>
>>> The default is to not set any watermark estimate.
>>>
>>> For all watermark estimators you're allowed to set the watermark
>>> estimate to anything as the runner will recompute the output watermark as:
>>> new output watermark = max(previous output watermark, min(upstream
>>> watermark, watermark estimates))
>>> This effectively means that the watermark will never go backwards from
>>> the runners point of view but that does mean that setting the watermark
>>> estimate below the previous output watermark (which isn't observable) will
>>> not do anything beyond holding the watermark at the previous output
>>> watermark.
>>>
>>> Depending on the windowing strategy and allowed lateness, any records
>>> that are output with a timestamp that is too early can be considered
>>> droppably late, otherwise they will be late/ontime/early.
>>>
>>> So as an author for an SDF transform, you need to figure out:
>>> 1) What timestamp your going to output your records at
>>> * use upstream input elements timestamp: guidance use the default
>>> implementation and to get the upstream watermark by default
>>> * use data from within the record being output or external system state
>>> via an API call: use a watermark estimator
>>> 2) How you want to compute the watermark estimate (if at all)
>>> * the choice here depends on how the elements timestamps progress, are
>>> they in exactly sorted order, almost sorted order, completely unsorted, ...?
>>>
>>> For both of these it is upto you to choose how much flexibility in these
>>> decisions you want to give to your users and that should guide what you
>>> expose within the API (like how KafkaIO exposes a TimestampPolicy) or how
>>> many other sources don't expose anything.
>>>
>>>
>>> On Thu, Sep 17, 2020 at 8:43 AM Praveen K Viswanathan <
>>> harish.praveen@gmail.com> wrote:
>>>
>>>> Hi Luke,
>>>>
>>>> I am also looking at the `WatermarkEstimators.manual` option, in
>>>> parallel. Now we are getting data past our Fixed Window but the aggregation
>>>> is not as expected.  The doc says setWatermark will "set timestamp
>>>> before or at the timestamps of all future elements produced by the
>>>> associated DoFn". If I output with a timestamp as below then could you
>>>> please clarify on how we should set the watermark for this manual
>>>> watermark estimator?
>>>>
>>>> receiver.outputWithTimestamp(ossRecord, Instant.now());
>>>>
>>>> Thanks,
>>>> Praveen
>>>>
>>>> On Mon, Sep 14, 2020 at 9:10 AM Luke Cwik <lc...@google.com> wrote:
>>>>
>>>>> Is the watermark advancing[1, 2] for the SDF such that the windows can
>>>>> close allowing for the Count transform to produce output?
>>>>>
>>>>> 1: https://www.youtube.com/watch?v=TWxSLmkWPm4
>>>>> 2: https://beam.apache.org/documentation/programming-guide/#windowing
>>>>>
>>>>> On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <ga...@oracle.com>
>>>>> wrote:
>>>>>
>>>>>> Hi everyone!
>>>>>>
>>>>>> We are developing a new IO connector using the SDF API, and testing
>>>>>> it with the following simple counting pipeline:
>>>>>>
>>>>>>
>>>>>>
>>>>>> p.apply(MyIO.read()
>>>>>>
>>>>>>         .withStream(inputStream)
>>>>>>
>>>>>>         .withStreamPartitions(Arrays.asList(0))
>>>>>>
>>>>>>         .withConsumerConfig(config)
>>>>>>
>>>>>>     ) // gets a PCollection<KV<String, String>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> .apply(Values.<String>*create*()) // PCollection<String>
>>>>>>
>>>>>>
>>>>>>
>>>>>> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>>>>>>
>>>>>>     .withAllowedLateness(Duration.standardDays(1))
>>>>>>
>>>>>>     .accumulatingFiredPanes())
>>>>>>
>>>>>>
>>>>>>
>>>>>> .apply(Count.<String>perElement())
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> // write PCollection<KV<String, Long>> to stream
>>>>>>
>>>>>> .apply(MyIO.write()
>>>>>>
>>>>>>         .withStream(outputStream)
>>>>>>
>>>>>>         .withConsumerConfig(config));
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Without the window transform, we can read from the stream and write
>>>>>> to it, however, I don’t see output after the Window transform. Could you
>>>>>> please help pin down the issue?
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Gaurav
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Praveen K Viswanathan
>>>>
>>>
>
> --
> Thanks,
> Praveen K Viswanathan
>

Re: Output from Window not getting materialized

Posted by Praveen K Viswanathan <ha...@gmail.com>.
Hi Luke, thanks for the detailed explanation. This gives more insight for
new people like me who are trying to grok the whole concept.

*1) What timestamp you're going to output your records at*

* use the upstream input element's timestamp: guidance is to use the default
implementation, which gets the upstream watermark by default
* use data from within the record being output or external system state
via an API call: use a watermark estimator

Regarding the above section: I do not think our source has a built-in
watermark concept that we could derive and use in the SDF, so we will have
to go with the second option. If we could extract a timestamp from the
source message, would we have to call setWatermark with that extracted
timestamp before each output in @ProcessElement? And can we use the Manual
watermark estimator itself for this approach?

*2) How you want to compute the watermark estimate (if at all)*

* the choice here depends on how the elements' timestamps progress: are
they in exactly sorted order, almost sorted order, completely unsorted,
...?

Our elements are in "almost sorted" order, which is why we want to hold off
processing message_01 with timestamp 11:00:10 AM until we have processed
message_02 with timestamp 11:00:08 AM. How do we enable this ordering while
processing the messages?

Based on your suggestion, I tried the WallTime estimator and it worked for
one of our many scenarios. I am planning to test it with a bunch of other
window types and use it until we get a solid hold on the above-mentioned
approach that can handle the unsorted messages.

Regards,
Praveen

On Fri, Sep 18, 2020 at 10:06 AM Luke Cwik <lc...@google.com> wrote:

> To answer your specific question, you should create and return the
> WallTime estimator. You shouldn't need to interact with it from within
> your @ProcessElement call since your elements are using the current time
> for their timestamp.
>
> On Fri, Sep 18, 2020 at 10:04 AM Luke Cwik <lc...@google.com> wrote:
>
>> Kafka is a complex example because it is adapting code from before there
>> was an SDF implementation (namely the TimestampPolicy and the
>> TimestampFn/TimestampFnS/WatermarkFn/WatermarkFn2 functions).
>>
>> There are three types of watermark estimators that are in the Beam Java
>> SDK today:
>> Manual: Can be invoked from within your @ProcessElement method within
>> your SDF allowing you precise control over what the watermark is.
>> WallTime: Doesn't need to be interacted with, will report the current
>> time as the watermark time. Once it is instantiated and returned via the
>> @NewWatermarkEstimator method you don't need to do anything with it. This
>> is functionally equivalent to calling setWatermark(Instant.now()) right
>> before returning from the @ProcessElement method in the SplittableDoFn on a
>> Manual watermark.
>> TimestampObserving: Is invoked using the output timestamp for every
>> element that is output. This is functionally equivalent to calling
>> setWatermark after each output within your @ProcessElement method in the
>> SplittableDoFn. The MonotonicallyIncreasing implementation for
>> the TimestampObserving estimator ensures that the largest timestamp seen so
>> far will be reported for the watermark.
>>
>> The default is to not set any watermark estimate.
>>
>> For all watermark estimators you're allowed to set the watermark estimate
>> to anything as the runner will recompute the output watermark as:
>> new output watermark = max(previous output watermark, min(upstream
>> watermark, watermark estimates))
>> This effectively means that the watermark will never go backwards from
>> the runners point of view but that does mean that setting the watermark
>> estimate below the previous output watermark (which isn't observable) will
>> not do anything beyond holding the watermark at the previous output
>> watermark.
>>
>> Depending on the windowing strategy and allowed lateness, any records
>> that are output with a timestamp that is too early can be considered
>> droppably late, otherwise they will be late/ontime/early.
>>
>> So as an author for an SDF transform, you need to figure out:
>> 1) What timestamp your going to output your records at
>> * use upstream input elements timestamp: guidance use the default
>> implementation and to get the upstream watermark by default
>> * use data from within the record being output or external system state
>> via an API call: use a watermark estimator
>> 2) How you want to compute the watermark estimate (if at all)
>> * the choice here depends on how the elements timestamps progress, are
>> they in exactly sorted order, almost sorted order, completely unsorted, ...?
>>
>> For both of these it is upto you to choose how much flexibility in these
>> decisions you want to give to your users and that should guide what you
>> expose within the API (like how KafkaIO exposes a TimestampPolicy) or how
>> many other sources don't expose anything.
>>
>>
>> On Thu, Sep 17, 2020 at 8:43 AM Praveen K Viswanathan <
>> harish.praveen@gmail.com> wrote:
>>
>>> Hi Luke,
>>>
>>> I am also looking at the `WatermarkEstimators.manual` option, in
>>> parallel. Now we are getting data past our Fixed Window but the aggregation
>>> is not as expected.  The doc says setWatermark will "set timestamp
>>> before or at the timestamps of all future elements produced by the
>>> associated DoFn". If I output with a timestamp as below then could you
>>> please clarify on how we should set the watermark for this manual
>>> watermark estimator?
>>>
>>> receiver.outputWithTimestamp(ossRecord, Instant.now());
>>>
>>> Thanks,
>>> Praveen
>>>
>>> On Mon, Sep 14, 2020 at 9:10 AM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> Is the watermark advancing[1, 2] for the SDF such that the windows can
>>>> close allowing for the Count transform to produce output?
>>>>
>>>> 1: https://www.youtube.com/watch?v=TWxSLmkWPm4
>>>> 2: https://beam.apache.org/documentation/programming-guide/#windowing
>>>>
>>>> On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <ga...@oracle.com>
>>>> wrote:
>>>>
>>>>> Hi everyone!
>>>>>
>>>>> We are developing a new IO connector using the SDF API, and testing it
>>>>> with the following simple counting pipeline:
>>>>>
>>>>>
>>>>>
>>>>> p.apply(MyIO.read()
>>>>>
>>>>>         .withStream(inputStream)
>>>>>
>>>>>         .withStreamPartitions(Arrays.asList(0))
>>>>>
>>>>>         .withConsumerConfig(config)
>>>>>
>>>>>     ) // gets a PCollection<KV<String, String>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> .apply(Values.<String>*create*()) // PCollection<String>
>>>>>
>>>>>
>>>>>
>>>>> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>>>>>
>>>>>     .withAllowedLateness(Duration.standardDays(1))
>>>>>
>>>>>     .accumulatingFiredPanes())
>>>>>
>>>>>
>>>>>
>>>>> .apply(Count.<String>perElement())
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> // write PCollection<KV<String, Long>> to stream
>>>>>
>>>>> .apply(MyIO.write()
>>>>>
>>>>>         .withStream(outputStream)
>>>>>
>>>>>         .withConsumerConfig(config));
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Without the window transform, we can read from the stream and write to
>>>>> it, however, I don’t see output after the Window transform. Could you
>>>>> please help pin down the issue?
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Gaurav
>>>>>
>>>>
>>>
>>> --
>>> Thanks,
>>> Praveen K Viswanathan
>>>
>>

-- 
Thanks,
Praveen K Viswanathan

Re: Output from Window not getting materialized

Posted by Luke Cwik <lc...@google.com>.
To answer your specific question, you should create and return the WallTime
estimator. You shouldn't need to interact with it from within
your @ProcessElement call since your elements are using the current time
for their timestamp.
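
Concretely, that is roughly (only the watermark-related methods of your SDF
are shown):

@GetInitialWatermarkEstimatorState
public Instant getInitialWatermarkEstimatorState() {
  return BoundedWindow.TIMESTAMP_MIN_VALUE;
}

@NewWatermarkEstimator
public WatermarkEstimators.WallTime newWatermarkEstimator(
    @WatermarkEstimatorState Instant state) {
  // Reports max(state, current wall time); nothing to call from @ProcessElement.
  return new WatermarkEstimators.WallTime(state);
}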

On Fri, Sep 18, 2020 at 10:04 AM Luke Cwik <lc...@google.com> wrote:

> Kafka is a complex example because it is adapting code from before there
> was an SDF implementation (namely the TimestampPolicy and the
> TimestampFn/TimestampFnS/WatermarkFn/WatermarkFn2 functions).
>
> There are three types of watermark estimators that are in the Beam Java
> SDK today:
> Manual: Can be invoked from within your @ProcessElement method within your
> SDF allowing you precise control over what the watermark is.
> WallTime: Doesn't need to be interacted with, will report the current time
> as the watermark time. Once it is instantiated and returned via the
> @NewWatermarkEstimator method you don't need to do anything with it. This
> is functionally equivalent to calling setWatermark(Instant.now()) right
> before returning from the @ProcessElement method in the SplittableDoFn on a
> Manual watermark.
> TimestampObserving: Is invoked using the output timestamp for every
> element that is output. This is functionally equivalent to calling
> setWatermark after each output within your @ProcessElement method in the
> SplittableDoFn. The MonotonicallyIncreasing implementation for
> the TimestampObserving estimator ensures that the largest timestamp seen so
> far will be reported for the watermark.
>
> The default is to not set any watermark estimate.
>
> For all watermark estimators you're allowed to set the watermark estimate
> to anything as the runner will recompute the output watermark as:
> new output watermark = max(previous output watermark, min(upstream
> watermark, watermark estimates))
> This effectively means that the watermark will never go backwards from the
> runners point of view but that does mean that setting the watermark
> estimate below the previous output watermark (which isn't observable) will
> not do anything beyond holding the watermark at the previous output
> watermark.
>
> Depending on the windowing strategy and allowed lateness, any records that
> are output with a timestamp that is too early can be considered droppably
> late, otherwise they will be late/ontime/early.
>
> So as an author for an SDF transform, you need to figure out:
> 1) What timestamp your going to output your records at
> * use upstream input elements timestamp: guidance use the default
> implementation and to get the upstream watermark by default
> * use data from within the record being output or external system state
> via an API call: use a watermark estimator
> 2) How you want to compute the watermark estimate (if at all)
> * the choice here depends on how the elements timestamps progress, are
> they in exactly sorted order, almost sorted order, completely unsorted, ...?
>
> For both of these it is upto you to choose how much flexibility in these
> decisions you want to give to your users and that should guide what you
> expose within the API (like how KafkaIO exposes a TimestampPolicy) or how
> many other sources don't expose anything.
>
>
> On Thu, Sep 17, 2020 at 8:43 AM Praveen K Viswanathan <
> harish.praveen@gmail.com> wrote:
>
>> Hi Luke,
>>
>> I am also looking at the `WatermarkEstimators.manual` option, in
>> parallel. Now we are getting data past our Fixed Window but the aggregation
>> is not as expected.  The doc says setWatermark will "set timestamp
>> before or at the timestamps of all future elements produced by the
>> associated DoFn". If I output with a timestamp as below then could you
>> please clarify on how we should set the watermark for this manual
>> watermark estimator?
>>
>> receiver.outputWithTimestamp(ossRecord, Instant.now());
>>
>> Thanks,
>> Praveen
>>
>> On Mon, Sep 14, 2020 at 9:10 AM Luke Cwik <lc...@google.com> wrote:
>>
>>> Is the watermark advancing[1, 2] for the SDF such that the windows can
>>> close allowing for the Count transform to produce output?
>>>
>>> 1: https://www.youtube.com/watch?v=TWxSLmkWPm4
>>> 2: https://beam.apache.org/documentation/programming-guide/#windowing
>>>
>>> On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <ga...@oracle.com>
>>> wrote:
>>>
>>>> Hi everyone!
>>>>
>>>> We are developing a new IO connector using the SDF API, and testing it
>>>> with the following simple counting pipeline:
>>>>
>>>>
>>>>
>>>> p.apply(MyIO.read()
>>>>
>>>>         .withStream(inputStream)
>>>>
>>>>         .withStreamPartitions(Arrays.asList(0))
>>>>
>>>>         .withConsumerConfig(config)
>>>>
>>>>     ) // gets a PCollection<KV<String, String>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> .apply(Values.<String>*create*()) // PCollection<String>
>>>>
>>>>
>>>>
>>>> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>>>>
>>>>     .withAllowedLateness(Duration.standardDays(1))
>>>>
>>>>     .accumulatingFiredPanes())
>>>>
>>>>
>>>>
>>>> .apply(Count.<String>perElement())
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> // write PCollection<KV<String, Long>> to stream
>>>>
>>>> .apply(MyIO.write()
>>>>
>>>>         .withStream(outputStream)
>>>>
>>>>         .withConsumerConfig(config));
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Without the window transform, we can read from the stream and write to
>>>> it, however, I don’t see output after the Window transform. Could you
>>>> please help pin down the issue?
>>>>
>>>> Thank you,
>>>>
>>>> Gaurav
>>>>
>>>
>>
>> --
>> Thanks,
>> Praveen K Viswanathan
>>
>

Re: Output from Window not getting materialized

Posted by Luke Cwik <lc...@google.com>.
Kafka is a complex example because it is adapting code from before there
was an SDF implementation (namely the TimestampPolicy and the
TimestampFn/TimestampFnS/WatermarkFn/WatermarkFn2 functions).

There are three types of watermark estimators that are in the Beam Java SDK
today:
Manual: Can be invoked from within your @ProcessElement method within your
SDF allowing you precise control over what the watermark is.
WallTime: Doesn't need to be interacted with; it will report the current time
as the watermark. Once it is instantiated and returned via the
@NewWatermarkEstimator method you don't need to do anything with it. This
is functionally equivalent to calling setWatermark(Instant.now()) right
before returning from the @ProcessElement method in the SplittableDoFn with a
Manual watermark estimator.
TimestampObserving: Is invoked using the output timestamp for every element
that is output. This is functionally equivalent to calling setWatermark
after each output within your @ProcessElement method in the SplittableDoFn.
The MonotonicallyIncreasing implementation for the TimestampObserving
estimator ensures that the largest timestamp seen so far will be reported
for the watermark.
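
For example, the Manual flavor is wired up roughly like this (a sketch; the
element loop is elided):

@GetInitialWatermarkEstimatorState
public Instant getInitialWatermarkEstimatorState() {
  return BoundedWindow.TIMESTAMP_MIN_VALUE;
}

@NewWatermarkEstimator
public WatermarkEstimators.Manual newWatermarkEstimator(
    @WatermarkEstimatorState Instant state) {
  return new WatermarkEstimators.Manual(state);
}

@ProcessElement
public ProcessContinuation processElement(
    RestrictionTracker<OffsetRange, Long> tracker,
    OutputReceiver<String> receiver,
    ManualWatermarkEstimator<Instant> watermarkEstimator) {
  // ... claim positions and output elements ...
  // Advance the estimate explicitly from within the process call.
  watermarkEstimator.setWatermark(Instant.now());
  return ProcessContinuation.resume();
}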

The default is to not set any watermark estimate.

For all watermark estimators you're allowed to set the watermark estimate
to anything, because the runner will recompute the output watermark as:
new output watermark = max(previous output watermark, min(upstream
watermark, watermark estimates))
This effectively means that the watermark will never go backwards from the
runner's point of view, but it does mean that setting the watermark
estimate below the previous output watermark (which isn't observable) will
not do anything beyond holding the watermark at the previous output
watermark.

Depending on the windowing strategy and allowed lateness, any records that
are output with a timestamp that is too early can be considered droppably
late; otherwise they will be late/on-time/early.

So as an author of an SDF transform, you need to figure out:
1) What timestamp you're going to output your records at
* use the upstream input element's timestamp: guidance is to use the default
implementation, which gets the upstream watermark by default
* use data from within the record being output, or external system state via
an API call: use a watermark estimator
2) How you want to compute the watermark estimate (if at all)
* the choice here depends on how the elements' timestamps progress: are they
in exactly sorted order, almost sorted order, completely unsorted, ...?

For both of these it is up to you to choose how much flexibility in these
decisions you want to give to your users, and that should guide what you
expose within the API (like how KafkaIO exposes a TimestampPolicy), or
whether, like many other sources, you expose nothing at all.
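
For example (purely illustrative, none of these names exist), you could let
users plug in a timestamp function on the read transform and feed it into
outputWithTimestamp:

// Hypothetical builder hook on the read transform:
public Read withTimestampFn(SerializableFunction<MyRecord, Instant> timestampFn) {
  return toBuilder().setTimestampFn(timestampFn).build();
}

// Inside the SDF's @ProcessElement, the user-supplied function picks the event time:
Instant eventTime = spec.getTimestampFn().apply(record);
receiver.outputWithTimestamp(KV.of(record.getKey(), record.getValue()), eventTime);

// And a pipeline author would configure it like:
p.apply(MyIO.read()
    .withStream(inputStream)
    .withTimestampFn(record -> new Instant(record.getEventMillis())));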


On Thu, Sep 17, 2020 at 8:43 AM Praveen K Viswanathan <
harish.praveen@gmail.com> wrote:

> Hi Luke,
>
> I am also looking at the `WatermarkEstimators.manual` option, in parallel.
> Now we are getting data past our Fixed Window but the aggregation is not as
> expected.  The doc says setWatermark will "set timestamp before or at the
> timestamps of all future elements produced by the associated DoFn". If I
> output with a timestamp as below then could you please clarify on how we
> should set the watermark for this manual watermark estimator?
>
> receiver.outputWithTimestamp(ossRecord, Instant.now());
>
> Thanks,
> Praveen
>
> On Mon, Sep 14, 2020 at 9:10 AM Luke Cwik <lc...@google.com> wrote:
>
>> Is the watermark advancing[1, 2] for the SDF such that the windows can
>> close allowing for the Count transform to produce output?
>>
>> 1: https://www.youtube.com/watch?v=TWxSLmkWPm4
>> 2: https://beam.apache.org/documentation/programming-guide/#windowing
>>
>> On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <ga...@oracle.com>
>> wrote:
>>
>>> Hi everyone!
>>>
>>> We are developing a new IO connector using the SDF API, and testing it
>>> with the following simple counting pipeline:
>>>
>>>
>>>
>>> p.apply(MyIO.read()
>>>
>>>         .withStream(inputStream)
>>>
>>>         .withStreamPartitions(Arrays.asList(0))
>>>
>>>         .withConsumerConfig(config)
>>>
>>>     ) // gets a PCollection<KV<String, String>>
>>>
>>>
>>>
>>>
>>>
>>> .apply(Values.<String>*create*()) // PCollection<String>
>>>
>>>
>>>
>>> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>>>
>>>     .withAllowedLateness(Duration.standardDays(1))
>>>
>>>     .accumulatingFiredPanes())
>>>
>>>
>>>
>>> .apply(Count.<String>perElement())
>>>
>>>
>>>
>>>
>>>
>>> // write PCollection<KV<String, Long>> to stream
>>>
>>> .apply(MyIO.write()
>>>
>>>         .withStream(outputStream)
>>>
>>>         .withConsumerConfig(config));
>>>
>>>
>>>
>>>
>>>
>>> Without the window transform, we can read from the stream and write to
>>> it, however, I don’t see output after the Window transform. Could you
>>> please help pin down the issue?
>>>
>>> Thank you,
>>>
>>> Gaurav
>>>
>>
>
> --
> Thanks,
> Praveen K Viswanathan
>

Re: Output from Window not getting materialized

Posted by Praveen K Viswanathan <ha...@gmail.com>.
Hi Luke,

I am also looking at the `WatermarkEstimators.manual` option in parallel.
Now we are getting data past our Fixed Window, but the aggregation is not as
expected. The doc says setWatermark will "set a timestamp before or at the
timestamps of all future elements produced by the associated DoFn". If I
output with a timestamp as below, could you please clarify how we
should set the watermark for this manual watermark estimator?

receiver.outputWithTimestamp(ossRecord, Instant.now());

Thanks,
Praveen

On Mon, Sep 14, 2020 at 9:10 AM Luke Cwik <lc...@google.com> wrote:

> Is the watermark advancing[1, 2] for the SDF such that the windows can
> close allowing for the Count transform to produce output?
>
> 1: https://www.youtube.com/watch?v=TWxSLmkWPm4
> 2: https://beam.apache.org/documentation/programming-guide/#windowing
>
> On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <ga...@oracle.com>
> wrote:
>
>> Hi everyone!
>>
>> We are developing a new IO connector using the SDF API, and testing it
>> with the following simple counting pipeline:
>>
>>
>>
>> p.apply(MyIO.read()
>>
>>         .withStream(inputStream)
>>
>>         .withStreamPartitions(Arrays.asList(0))
>>
>>         .withConsumerConfig(config)
>>
>>     ) // gets a PCollection<KV<String, String>>
>>
>>
>>
>>
>>
>> .apply(Values.<String>*create*()) // PCollection<String>
>>
>>
>>
>> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>>
>>     .withAllowedLateness(Duration.standardDays(1))
>>
>>     .accumulatingFiredPanes())
>>
>>
>>
>> .apply(Count.<String>perElement())
>>
>>
>>
>>
>>
>> // write PCollection<KV<String, Long>> to stream
>>
>> .apply(MyIO.write()
>>
>>         .withStream(outputStream)
>>
>>         .withConsumerConfig(config));
>>
>>
>>
>>
>>
>> Without the window transform, we can read from the stream and write to
>> it, however, I don’t see output after the Window transform. Could you
>> please help pin down the issue?
>>
>> Thank you,
>>
>> Gaurav
>>
>

-- 
Thanks,
Praveen K Viswanathan

Re: Output from Window not getting materialized

Posted by Luke Cwik <lc...@google.com>.
Is the watermark advancing [1, 2] for the SDF such that the windows can
close, allowing the Count transform to produce output?

1: https://www.youtube.com/watch?v=TWxSLmkWPm4
2: https://beam.apache.org/documentation/programming-guide/#windowing

On Thu, Sep 10, 2020 at 12:39 PM Gaurav Nakum <ga...@oracle.com>
wrote:

> Hi everyone!
>
> We are developing a new IO connector using the SDF API, and testing it
> with the following simple counting pipeline:
>
>
>
> p.apply(MyIO.read()
>
>         .withStream(inputStream)
>
>         .withStreamPartitions(Arrays.asList(0))
>
>         .withConsumerConfig(config)
>
>     ) // gets a PCollection<KV<String, String>>
>
>
>
>
>
> .apply(Values.<String>*create*()) // PCollection<String>
>
>
>
> .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10)))
>
>     .withAllowedLateness(Duration.standardDays(1))
>
>     .accumulatingFiredPanes())
>
>
>
> .apply(Count.<String>perElement())
>
>
>
>
>
> // write PCollection<KV<String, Long>> to stream
>
> .apply(MyIO.write()
>
>         .withStream(outputStream)
>
>         .withConsumerConfig(config));
>
>
>
>
>
> Without the window transform, we can read from the stream and write to it,
> however, I don’t see output after the Window transform. Could you please
> help pin down the issue?
>
> Thank you,
>
> Gaurav
>