Posted to issues@beam.apache.org by "Ruoyun Huang (Jira)" <ji...@apache.org> on 2019/11/13 00:45:00 UTC
[jira] [Commented] (BEAM-8645) TimestampCombiner incorrect in beam python
[ https://issues.apache.org/jira/browse/BEAM-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972921#comment-16972921 ]
Ruoyun Huang commented on BEAM-8645:
------------------------------------
Created a PR to show how this happens: [https://github.com/apache/beam/pull/10081]
> TimestampCombiner incorrect in beam python
> ------------------------------------------
>
> Key: BEAM-8645
> URL: https://issues.apache.org/jira/browse/BEAM-8645
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Ruoyun Huang
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When we have a TimestampedValue in a combine:
> {code:python}
> import apache_beam as beam
> from apache_beam.testing.test_stream import TestStream
> from apache_beam.transforms import window
> from apache_beam.transforms.window import TimestampCombiner
>
> # p is an existing beam.Pipeline running in streaming mode.
> main_stream = (
>     p
>     # Two elements for key 'k' at timestamps 0 and 9, both in window [0, 10).
>     | 'main TestStream' >> TestStream()
>         .add_elements([window.TimestampedValue(('k', 100), 0)])
>         .add_elements([window.TimestampedValue(('k', 400), 9)])
>         .advance_watermark_to_infinity()
>     | 'main windowInto' >> beam.WindowInto(
>         window.FixedWindows(10),
>         timestamp_combiner=TimestampCombiner.OUTPUT_AT_LATEST)
>     | 'Combine' >> beam.CombinePerKey(sum))
> {code}
> The expected timestamps are:
> LATEST: (('k', 500), Timestamp(9)),
> EARLIEST: (('k', 500), Timestamp(0)),
> END_OF_WINDOW: (('k', 500), Timestamp(10)),
> But Python streaming currently gives the following results:
> LATEST: (('k', 500), Timestamp(10)),
> EARLIEST: (('k', 500), Timestamp(10)),
> END_OF_WINDOW: (('k', 500), Timestamp(9.99999999)),
> More details and discussions:
> https://lists.apache.org/thread.html/d3af1f2f84a2e59a747196039eae77812b78a991f0f293c717e5f4e1@%3Cdev.beam.apache.org%3E
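For reference, the expected timestamps listed in the description can be modelled in a few lines of plain Python. This is only a sketch of the arithmetic, not Beam's implementation: the function name `expected_output_timestamp` and the string combiner names are hypothetical, and the END_OF_WINDOW value simply follows the Timestamp(10) given in the description.

```python
# Illustrative model of the timestamps the issue description expects.
# Not Beam code; all names here are hypothetical.

def expected_output_timestamp(combiner, element_timestamps, window_end):
    """Return the output timestamp the description expects for a combined value."""
    if combiner == "LATEST":
        return max(element_timestamps)
    if combiner == "EARLIEST":
        return min(element_timestamps)
    if combiner == "END_OF_WINDOW":
        # The description lists Timestamp(10), the end of window [0, 10).
        return window_end
    raise ValueError(f"unknown combiner: {combiner}")


element_timestamps = [0, 9]  # timestamps of ('k', 100) and ('k', 400)
window_end = 10              # FixedWindows(10) puts both in window [0, 10)

expected = {
    name: expected_output_timestamp(name, element_timestamps, window_end)
    for name in ("LATEST", "EARLIEST", "END_OF_WINDOW")
}
print(expected)
```

Comparing these values with the reported results shows that LATEST and EARLIEST both come back as 10 instead of 9 and 0, while END_OF_WINDOW comes back as 9.99999999 instead of 10.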
--
This message was sent by Atlassian Jira
(v8.3.4#803005)