Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/04/21 10:51:38 UTC

[GitHub] [beam] Ardagan opened a new pull request #11477: Add PeriodicSequence generator.

Ardagan opened a new pull request #11477:
URL: https://github.com/apache/beam/pull/11477


   Add java snippet for slowly updating side inputs.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] Ardagan removed a comment on pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan removed a comment on pull request #11477:
URL: https://github.com/apache/beam/pull/11477#issuecomment-620700722


   run java precommit





[GitHub] [beam] TheNeuralBit commented on a change in pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #11477:
URL: https://github.com/apache/beam/pull/11477#discussion_r414851061



##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PeriodicImpulseTest.java
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.util.ArrayList;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.UsesImpulse;
+import org.apache.beam.sdk.testing.UsesStatefulParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for PeriodicImpulse. */
+@RunWith(JUnit4.class)
+public class PeriodicImpulseTest {
+  @Rule public transient TestPipeline p = TestPipeline.create();
+
+  public static class ExtractTsDoFn<InputT> extends DoFn<InputT, KV<InputT, Instant>> {
+    @ProcessElement
+    public void processElement(DoFn<InputT, KV<InputT, Instant>>.ProcessContext c)
+        throws Exception {
+      c.output(KV.of(c.element(), c.timestamp()));
+    }
+  }
+
+  @Test
+  @Category({
+    NeedsRunner.class,
+    UsesImpulse.class,
+    UsesStatefulParDo.class,
+  })
+  public void testOutputsProperElements() {
+    Instant instant = Instant.now();

Review comment:
       On the DirectRunner at least you could use [DateTimeUtils#setCurrentMillisFixed](https://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeUtils.html#setCurrentMillisFixed-long-)
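
The pinned-clock idea in this comment can be sketched with the JDK alone. This is an analogy, not the PR's code: it swaps Joda-Time's `DateTimeUtils.setCurrentMillisFixed` for an injected `java.time.Clock`, and `PinnedClockSketch` and `startTime` are hypothetical names. It only shows that a start time derived from "now" becomes a constant once the clock is fixed.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

/** Hypothetical helper, not part of Beam or of this PR. */
final class PinnedClockSketch {
  /** Mirrors the quoted test's "now minus 100 hours", but reads "now" from an injected clock. */
  static Instant startTime(Clock clock) {
    return clock.instant().minus(Duration.ofHours(100));
  }

  public static void main(String[] args) {
    // Assumption: any fixed instant works; this one is arbitrary.
    Clock pinned = Clock.fixed(Instant.parse("2020-04-21T10:00:00Z"), ZoneOffset.UTC);
    // Both calls see the same "now", so the derived start time is identical.
    System.out.println(startTime(pinned));
    System.out.println(startTime(pinned));
  }
}
```

As the later replies note, this kind of pinning only affects the JVM where it is applied, so it would not cover a clock on a remote worker.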







[GitHub] [beam] Ardagan commented on a change in pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan commented on a change in pull request #11477:
URL: https://github.com/apache/beam/pull/11477#discussion_r414855462



##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PeriodicImpulseTest.java
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.util.ArrayList;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.UsesImpulse;
+import org.apache.beam.sdk.testing.UsesStatefulParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for PeriodicImpulse. */
+@RunWith(JUnit4.class)
+public class PeriodicImpulseTest {
+  @Rule public transient TestPipeline p = TestPipeline.create();
+
+  public static class ExtractTsDoFn<InputT> extends DoFn<InputT, KV<InputT, Instant>> {
+    @ProcessElement
+    public void processElement(DoFn<InputT, KV<InputT, Instant>>.ProcessContext c)
+        throws Exception {
+      c.output(KV.of(c.element(), c.timestamp()));
+    }
+  }
+
+  @Test
+  @Category({
+    NeedsRunner.class,
+    UsesImpulse.class,
+    UsesStatefulParDo.class,
+  })
+  public void testOutputsProperElements() {
+    Instant instant = Instant.now();
+
+    Instant startTime = instant.minus(Duration.standardHours(100));
+    long duration = 500;
+    Duration interval = Duration.millis(250);
+    long intervalMillis = interval.getMillis();
+    Instant stopTime = startTime.plus(duration);
+
+    PCollection<KV<Instant, Instant>> result =
+        p.apply(PeriodicImpulse.create().startAt(startTime).stopAt(stopTime).withInterval(interval))
+            .apply(ParDo.of(new ExtractTsDoFn<>()));
+
+    ArrayList<KV<Instant, Instant>> expectedResults =
+        new ArrayList<>((int) (duration / intervalMillis + 1));
+    for (long i = 0; i <= duration; i += intervalMillis) {
+      Instant el = startTime.plus(i);
+      expectedResults.add(KV.of(el, el));
+    }
+
+    PAssert.that(result).containsInAnyOrder(expectedResults);
+
+    p.run().waitUntilFinish();
+  }

Review comment:
       This is a valid comment. The problem is that a reliable test would require mocking time at many layers, and I didn't find a reliable way to do that. This test is a compromise that covers most of the functionality. Delaying elements into the future would make the tests unreasonably long and most likely unreliable.
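
For reference, the expected-element arithmetic in the quoted test (a 500 ms span with a 250 ms interval) can be sketched without Beam or Joda-Time. `ExpectedImpulseTimes` and `expectedMillis` are hypothetical names, and plain epoch milliseconds stand in for `Instant`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that mirrors the loop in testOutputsProperElements. */
final class ExpectedImpulseTimes {
  /** Returns start, start + interval, ... for every offset that does not exceed the duration. */
  static List<Long> expectedMillis(long startMillis, long durationMillis, long intervalMillis) {
    if (intervalMillis <= 0 || durationMillis < 0) {
      throw new IllegalArgumentException("interval must be positive and duration non-negative");
    }
    List<Long> out = new ArrayList<>((int) (durationMillis / intervalMillis + 1));
    for (long offset = 0; offset <= durationMillis; offset += intervalMillis) {
      out.add(startMillis + offset);
    }
    return out;
  }

  public static void main(String[] args) {
    // Same span and interval as the quoted test; the start value is arbitrary.
    System.out.println(expectedMillis(1_000L, 500L, 250L)); // prints [1000, 1250, 1500]
  }
}
```

With these values the stop time itself is included, which is why the test sizes its list as `duration / intervalMillis + 1`.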







[GitHub] [beam] Ardagan commented on pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan commented on pull request #11477:
URL: https://github.com/apache/beam/pull/11477#issuecomment-623545312









[GitHub] [beam] Ardagan commented on pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan commented on pull request #11477:
URL: https://github.com/apache/beam/pull/11477#issuecomment-620700722


   run java precommit





[GitHub] [beam] Ardagan commented on a change in pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan commented on a change in pull request #11477:
URL: https://github.com/apache/beam/pull/11477#discussion_r414854690



##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PeriodicImpulseTest.java
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.util.ArrayList;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.UsesImpulse;
+import org.apache.beam.sdk.testing.UsesStatefulParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for PeriodicImpulse. */
+@RunWith(JUnit4.class)
+public class PeriodicImpulseTest {
+  @Rule public transient TestPipeline p = TestPipeline.create();
+
+  public static class ExtractTsDoFn<InputT> extends DoFn<InputT, KV<InputT, Instant>> {
+    @ProcessElement
+    public void processElement(DoFn<InputT, KV<InputT, Instant>>.ProcessContext c)
+        throws Exception {
+      c.output(KV.of(c.element(), c.timestamp()));
+    }
+  }
+
+  @Test
+  @Category({
+    NeedsRunner.class,
+    UsesImpulse.class,
+    UsesStatefulParDo.class,
+  })
+  public void testOutputsProperElements() {
+    Instant instant = Instant.now();

Review comment:
       I looked into this, mostly at TestStream. The issue is that there is no way to get the runner's time inside the process method, so I have to rely on now() to figure out how many elements I can output.
   The current test is a compromise: it still verifies the functionality without making the test run too long.







[GitHub] [beam] TheNeuralBit commented on a change in pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #11477:
URL: https://github.com/apache/beam/pull/11477#discussion_r414822214



##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState;
+
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.range.OffsetRange;
+import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
+import org.apache.beam.sdk.transforms.splittabledofn.Sizes;
+import org.apache.beam.sdk.transforms.splittabledofn.SplitResult;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.MoreObjects;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+
+/**
+ * A {@link PTransform} which generates a sequence of timestamped elements at given interval in
+ * runtime.
+ *
+ * <p>Receives a PCollection<List<Long>> where each element triggers the generation of sequence and
+ * has following elements: 0: first element timestamp 1: last element timestamp 2: interval
+ *
+ * <p>All elements that have timestamp in the past will be output right away. Elements that have
+ * timestamp in the future will be delayed.

Review comment:
       This should make clear that past and future are determined by the system clock on the worker machine (can we just call that "processing time"?)
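
To make the "past versus future" distinction concrete, here is a minimal sketch of how many elements of a sequence would be due immediately for a given processing-time reading. It is not the PR's implementation. `ProcessingTimeSplit` and `dueCount` are hypothetical names, and treating a timestamp exactly equal to "now" as due is an assumption.

```java
/** Hypothetical helper, not the PR's implementation. */
final class ProcessingTimeSplit {
  /**
   * Counts the timestamps first, first + interval, ... (up to last) that are at or before
   * nowMillis. Under the Javadoc's description, the rest would be delayed.
   */
  static long dueCount(long firstMillis, long lastMillis, long intervalMillis, long nowMillis) {
    if (intervalMillis <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    if (lastMillis < firstMillis || nowMillis < firstMillis) {
      return 0; // empty sequence, or nothing is due yet
    }
    long horizon = Math.min(nowMillis, lastMillis);
    return (horizon - firstMillis) / intervalMillis + 1;
  }

  public static void main(String[] args) {
    // The worker clock reads 600 ms: the elements at 0, 250 and 500 are due, 750 and 1000 are not.
    System.out.println(dueCount(0L, 1_000L, 250L, 600L)); // prints 3
  }
}
```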

##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PeriodicImpulseTest.java
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.util.ArrayList;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.UsesImpulse;
+import org.apache.beam.sdk.testing.UsesStatefulParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for PeriodicImpulse. */
+@RunWith(JUnit4.class)
+public class PeriodicImpulseTest {
+  @Rule public transient TestPipeline p = TestPipeline.create();
+
+  public static class ExtractTsDoFn<InputT> extends DoFn<InputT, KV<InputT, Instant>> {
+    @ProcessElement
+    public void processElement(DoFn<InputT, KV<InputT, Instant>>.ProcessContext c)
+        throws Exception {
+      c.output(KV.of(c.element(), c.timestamp()));
+    }
+  }
+
+  @Test
+  @Category({
+    NeedsRunner.class,
+    UsesImpulse.class,
+    UsesStatefulParDo.class,
+  })
+  public void testOutputsProperElements() {
+    Instant instant = Instant.now();
+
+    Instant startTime = instant.minus(Duration.standardHours(100));
+    long duration = 500;
+    Duration interval = Duration.millis(250);
+    long intervalMillis = interval.getMillis();
+    Instant stopTime = startTime.plus(duration);
+
+    PCollection<KV<Instant, Instant>> result =
+        p.apply(PeriodicImpulse.create().startAt(startTime).stopAt(stopTime).withInterval(interval))
+            .apply(ParDo.of(new ExtractTsDoFn<>()));
+
+    ArrayList<KV<Instant, Instant>> expectedResults =
+        new ArrayList<>((int) (duration / intervalMillis + 1));
+    for (long i = 0; i <= duration; i += intervalMillis) {
+      Instant el = startTime.plus(i);
+      expectedResults.add(KV.of(el, el));
+    }
+
+    PAssert.that(result).containsInAnyOrder(expectedResults);
+
+    p.run().waitUntilFinish();
+  }

Review comment:
       It would be good to add a test to verify `PeriodicImpulse` delays elements with timestamps in the future. One way could be to just set a startTime far in the future and verify nothing is output after some delay. But then the delay would have to be long enough to make sure the pipeline actually started for every runner, which isn't ideal.

##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PeriodicImpulseTest.java
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.util.ArrayList;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.UsesImpulse;
+import org.apache.beam.sdk.testing.UsesStatefulParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for PeriodicImpulse. */
+@RunWith(JUnit4.class)
+public class PeriodicImpulseTest {
+  @Rule public transient TestPipeline p = TestPipeline.create();
+
+  public static class ExtractTsDoFn<InputT> extends DoFn<InputT, KV<InputT, Instant>> {
+    @ProcessElement
+    public void processElement(DoFn<InputT, KV<InputT, Instant>>.ProcessContext c)
+        throws Exception {
+      c.output(KV.of(c.element(), c.timestamp()));
+    }
+  }
+
+  @Test
+  @Category({
+    NeedsRunner.class,
+    UsesImpulse.class,
+    UsesStatefulParDo.class,
+  })
+  public void testOutputsProperElements() {
+    Instant instant = Instant.now();

Review comment:
       It would be a lot easier to verify many different situations if there were some way to mock the clock, but that's pretty challenging when the clock might be on a remote worker. Have we solved that problem anywhere else?

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState;
+
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.range.OffsetRange;
+import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
+import org.apache.beam.sdk.transforms.splittabledofn.Sizes;
+import org.apache.beam.sdk.transforms.splittabledofn.SplitResult;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.MoreObjects;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+
+/**
+ * A {@link PTransform} which generates a sequence of timestamped elements at given interval in
+ * runtime.
+ *
+ * <p>Receives a PCollection<List<Long>> where each element triggers the generation of sequence and
+ * has following elements: 0: first element timestamp 1: last element timestamp 2: interval

Review comment:
       Does the element type need to be a list? What about defining a type for it like:
   ```
   class SequenceDefinition {
     Instant first;
     Instant last;
     Duration interval;
     ...
   }
   ```

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -0,0 +1,180 @@
+/**
+ * A {@link PTransform} which generates a sequence of timestamped elements at given interval in
+ * runtime.
+ *
+ * <p>Receives a PCollection<List<Long>> where each element triggers the generation of sequence and
+ * has following elements: 0: first element timestamp 1: last element timestamp 2: interval
+ *
+ * <p>All elements that have timestamp in the past will be output right away. Elements that have
+ * timestamp in the future will be delayed.
+ *
+ * <p>Transform will not output elements prior to target timestamp. Transform can output elements at
+ * any time after target timestamp.

Review comment:
       I'm not sure I understand what this means, could you clarify? It looks like maybe it's re-stating the previous paragraph in a different way?
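
Read together, the two Javadoc paragraphs seem to describe a lower bound on emission time: an element is never output before its target timestamp, but nothing limits how late it may appear. A minimal sketch of that reading follows. The helper `EmissionDelay.delayFor` is hypothetical and `java.time` is used in place of Joda-Time; it is not code from the PR.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical illustration of the "not before the target time" contract. */
final class EmissionDelay {
  /**
   * Minimum time to wait before an element with the given target timestamp may be output.
   * Targets in the past need no wait; targets in the future wait until they are reached.
   */
  static Duration delayFor(Instant target, Instant now) {
    Duration untilTarget = Duration.between(now, target);
    return untilTarget.isNegative() ? Duration.ZERO : untilTarget;
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2020-04-21T10:00:00Z");
    System.out.println(delayFor(now.minusSeconds(3600), now)); // target in the past
    System.out.println(delayFor(now.plusSeconds(5), now)); // target in the future
  }
}
```

Under this reading the returned value is only a minimum: a runner may deliver the element later than that, which is what the second Javadoc paragraph appears to say.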

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -0,0 +1,180 @@
+/**
+ * A {@link PTransform} which generates a sequence of timestamped elements at given interval in
+ * runtime.
+ *
+ * <p>Receives a PCollection<List<Long>> where each element triggers the generation of sequence and
+ * has following elements: 0: first element timestamp 1: last element timestamp 2: interval
+ *
+ * <p>All elements that have timestamp in the past will be output right away. Elements that have
+ * timestamp in the future will be delayed.
+ *
+ * <p>Transform will not output elements prior to target timestamp. Transform can output elements at
+ * any time after target timestamp.
+ */
+@Experimental(Experimental.Kind.SPLITTABLE_DO_FN)
+public class PeriodicSequence extends PTransform<PCollection<List<Long>>, PCollection<Instant>> {

Review comment:
       Does this need to be public? If so, should it have a test as well?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on a change in pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #11477:
URL: https://github.com/apache/beam/pull/11477#discussion_r417656677



##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PeriodicSequenceTest.java
##########
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.util.ArrayList;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.UsesImpulse;
+import org.apache.beam.sdk.testing.UsesStatefulParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for PeriodicSequence. */
+@RunWith(JUnit4.class)
+public class PeriodicSequenceTest {
+  @Rule public transient TestPipeline p = TestPipeline.create();
+
+  public static class ExtractTsDoFn<InputT> extends DoFn<InputT, KV<InputT, Instant>> {
+    @ProcessElement
+    public void processElement(DoFn<InputT, KV<InputT, Instant>>.ProcessContext c)
+        throws Exception {
+      c.output(KV.of(c.element(), c.timestamp()));
+    }
+  }
+
+  @Test
+  @Category({
+    NeedsRunner.class,
+    UsesImpulse.class,
+    UsesStatefulParDo.class,
+  })
+  public void testOutputsProperElements() {
+    Instant instant = Instant.now();
+
+    Instant startTime = instant.minus(Duration.standardHours(100));
+    long duration = 500;
+    Duration interval = Duration.millis(250);
+    long intervalMillis = interval.getMillis();
+    Instant stopTime = startTime.plus(duration);
+
+    PCollection<KV<Instant, Instant>> result =
+        p.apply(
+                Create.<PeriodicSequence.SequenceDefinition>of(
+                    new PeriodicSequence.SequenceDefinition(startTime, stopTime, interval)))
+            .apply(PeriodicSequence.create())
+            .apply(ParDo.of(new ExtractTsDoFn<>()));

Review comment:
       Can you add a comment here explaining that the Instant is put into a KV so that the timestamp can be verified? It took me a while to realize that.
   (Also, if we don't already have a general-purpose way to verify timestamps in PCollections, maybe we should add one?)

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -21,33 +21,69 @@
 import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
 import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState;
 
-import java.util.List;
+import java.util.Objects;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.io.range.OffsetRange;
+import org.apache.beam.sdk.schemas.JavaFieldSchema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
 import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
-import org.apache.beam.sdk.transforms.splittabledofn.Sizes;
 import org.apache.beam.sdk.transforms.splittabledofn.SplitResult;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.MoreObjects;
 import org.joda.time.Duration;
 import org.joda.time.Instant;
 
 /**
- * A {@link PTransform} which generates a sequence of timestamped elements at given interval in
- * runtime.
+ * A {@link PTransform} which generates a sequence of timestamped elements at given runtime
+ * interval.
  *
- * <p>Receives a PCollection<List<Long>> where each element triggers the generation of sequence and
- * has following elements: 0: first element timestamp 1: last element timestamp 2: interval
+ * <p>Transform will not output elements prior to target time. Transform can output elements at any
+ * time after target time.
  *
- * <p>All elements that have timestamp in the past will be output right away. Elements that have
- * timestamp in the future will be delayed.
- *
- * <p>Transform will not output elements prior to target timestamp. Transform can output elements at
- * any time after target timestamp.
+ * <p>Multiple elements can be output at given moment if their timestamp is earlier than current
+ * time.
  */
 @Experimental(Experimental.Kind.SPLITTABLE_DO_FN)
-public class PeriodicSequence extends PTransform<PCollection<List<Long>>, PCollection<Instant>> {
+public class PeriodicSequence
+    extends PTransform<PCollection<PeriodicSequence.SequenceDefinition>, PCollection<Instant>> {
+
+  @DefaultSchema(JavaFieldSchema.class)
+  public static class SequenceDefinition {
+    public Instant first;
+    public Instant last;
+    public Long durationMilliSec;
+
+    public SequenceDefinition() {}
+
+    public SequenceDefinition(Instant first, Instant last, Duration duration) {
+      this.first = first;
+      this.last = last;
+      this.durationMilliSec = duration.getMillis();
+    }
+
+    @Override
+    public boolean equals(Object obj) {
+      if (this == obj) {
+        return true;
+      }
+
+      if (obj == null || obj.getClass() != this.getClass()) {
+        return false;
+      }
+
+      SequenceDefinition src = (SequenceDefinition) obj;
+      return src.first.equals(this.first)
+          && src.last.equals(this.last)
+          && src.durationMilliSec.equals(this.durationMilliSec);
+    }
+
+    @Override
+    public int hashCode() {
+      int result = Objects.hash(first, last, durationMilliSec);
+      return result;
+    }

Review comment:
       nit: Consider using `AutoValue` (and `AutoValueSchema` if you use schemas) so you don't have to implement these. Up to you though.
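
If the hand-written methods stay, one detail is worth noting: the `equals` in the hunk above dereferences `first`, `last`, and `durationMilliSec`, so comparing two instances built with the public no-arg constructor would throw a `NullPointerException`. A null-safe variant might look like the sketch below. The class name `SequenceDefinitionSketch` is hypothetical, and `java.time.Instant` stands in for the Joda-Time type so the example has no dependencies.

```java
import java.time.Instant;
import java.util.Objects;

/** Hypothetical stand-in for the SequenceDefinition class in the diff. */
final class SequenceDefinitionSketch {
  Instant first;
  Instant last;
  Long durationMilliSec;

  // Leaves every field null, like the no-arg constructor in the diff.
  SequenceDefinitionSketch() {}

  SequenceDefinitionSketch(Instant first, Instant last, long durationMilliSec) {
    this.first = first;
    this.last = last;
    this.durationMilliSec = durationMilliSec;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (obj == null || obj.getClass() != getClass()) {
      return false;
    }
    SequenceDefinitionSketch other = (SequenceDefinitionSketch) obj;
    // Objects.equals tolerates null on either side.
    return Objects.equals(first, other.first)
        && Objects.equals(last, other.last)
        && Objects.equals(durationMilliSec, other.durationMilliSec);
  }

  @Override
  public int hashCode() {
    // Objects.hash already handles null fields.
    return Objects.hash(first, last, durationMilliSec);
  }
}
```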

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -21,33 +21,69 @@
 import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
 import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState;
 
-import java.util.List;
+import java.util.Objects;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.io.range.OffsetRange;
+import org.apache.beam.sdk.schemas.JavaFieldSchema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
 import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
-import org.apache.beam.sdk.transforms.splittabledofn.Sizes;
 import org.apache.beam.sdk.transforms.splittabledofn.SplitResult;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.MoreObjects;
 import org.joda.time.Duration;
 import org.joda.time.Instant;
 
 /**
- * A {@link PTransform} which generates a sequence of timestamped elements at given interval in
- * runtime.
+ * A {@link PTransform} which generates a sequence of timestamped elements at given runtime
+ * interval.
  *
- * <p>Receives a PCollection<List<Long>> where each element triggers the generation of sequence and
- * has following elements: 0: first element timestamp 1: last element timestamp 2: interval
+ * <p>Transform will not output elements prior to target time. Transform can output elements at any
+ * time after target time.
  *
- * <p>All elements that have timestamp in the past will be output right away. Elements that have
- * timestamp in the future will be delayed.
- *
- * <p>Transform will not output elements prior to target timestamp. Transform can output elements at
- * any time after target timestamp.
+ * <p>Multiple elements can be output at given moment if their timestamp is earlier than current
+ * time.
  */
 @Experimental(Experimental.Kind.SPLITTABLE_DO_FN)
-public class PeriodicSequence extends PTransform<PCollection<List<Long>>, PCollection<Instant>> {
+public class PeriodicSequence
+    extends PTransform<PCollection<PeriodicSequence.SequenceDefinition>, PCollection<Instant>> {
+
+  @DefaultSchema(JavaFieldSchema.class)
+  public static class SequenceDefinition {
+    public Instant first;
+    public Instant last;
+    public Long durationMilliSec;

Review comment:
       We really should have support for `Duration` in Beam schemas... filed [BEAM-9859](https://issues.apache.org/jira/browse/BEAM-9859) for this.
   
   In the meantime, couldn't you just use `@DefaultCoder(SerializableCoder.class)` so you could store this as a Duration?
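
For context on this suggestion: `SerializableCoder` encodes values with standard Java serialization, so the element class would need to implement `Serializable`. The sketch below shows only that underlying mechanism, a round trip through `ObjectOutputStream` with the interval kept as a duration type. It is not Beam code: `SerializableDefinitionSketch`, `toBytes`, and `fromBytes` are hypothetical names, and `java.time` replaces Joda-Time to keep the example dependency-free.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical definition type that keeps the interval as a Duration. */
final class SerializableDefinitionSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  final Instant first;
  final Instant last;
  final Duration interval;

  SerializableDefinitionSketch(Instant first, Instant last, Duration interval) {
    this.first = first;
    this.last = last;
    this.interval = interval;
  }

  /** Encodes the value with plain Java serialization. */
  static byte[] toBytes(SerializableDefinitionSketch value) {
    try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Decodes a value previously produced by toBytes. */
  static SerializableDefinitionSketch fromBytes(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (SerializableDefinitionSketch) in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

The trade-off is that a serialized class has no schema, so schema-aware transforms would not see its fields.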

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -0,0 +1,212 @@
+package org.apache.beam.sdk.transforms;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState;
+
+import java.util.Objects;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.range.OffsetRange;
+import org.apache.beam.sdk.schemas.JavaFieldSchema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
+import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
+import org.apache.beam.sdk.transforms.splittabledofn.SplitResult;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.MoreObjects;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+
+/**
+ * A {@link PTransform} which generates a sequence of timestamped elements at given runtime
+ * interval.
+ *
+ * <p>Transform will not output elements prior to target time. Transform can output elements at any
+ * time after target time.

Review comment:
       Can you add a note that this is according to the clock on the worker node?







[GitHub] [beam] Ardagan commented on a change in pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan commented on a change in pull request #11477:
URL: https://github.com/apache/beam/pull/11477#discussion_r418673674



##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/PeriodicSequence.java
##########
@@ -21,33 +21,69 @@
 import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
 import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState;
 
-import java.util.List;
+import java.util.Objects;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.io.range.OffsetRange;
+import org.apache.beam.sdk.schemas.JavaFieldSchema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
 import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
-import org.apache.beam.sdk.transforms.splittabledofn.Sizes;
 import org.apache.beam.sdk.transforms.splittabledofn.SplitResult;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.MoreObjects;
 import org.joda.time.Duration;
 import org.joda.time.Instant;
 
 /**
- * A {@link PTransform} which generates a sequence of timestamped elements at given interval in
- * runtime.
+ * A {@link PTransform} which generates a sequence of timestamped elements at given runtime
+ * interval.
  *
- * <p>Receives a PCollection<List<Long>> where each element triggers the generation of sequence and
- * has following elements: 0: first element timestamp 1: last element timestamp 2: interval
+ * <p>Transform will not output elements prior to target time. Transform can output elements at any
+ * time after target time.
  *
- * <p>All elements that have timestamp in the past will be output right away. Elements that have
- * timestamp in the future will be delayed.
- *
- * <p>Transform will not output elements prior to target timestamp. Transform can output elements at
- * any time after target timestamp.
+ * <p>Multiple elements can be output at given moment if their timestamp is earlier than current
+ * time.
  */
 @Experimental(Experimental.Kind.SPLITTABLE_DO_FN)
-public class PeriodicSequence extends PTransform<PCollection<List<Long>>, PCollection<Instant>> {
+public class PeriodicSequence
+    extends PTransform<PCollection<PeriodicSequence.SequenceDefinition>, PCollection<Instant>> {
+
+  @DefaultSchema(JavaFieldSchema.class)
+  public static class SequenceDefinition {
+    public Instant first;
+    public Instant last;
+    public Long durationMilliSec;
+
+    public SequenceDefinition() {}
+
+    public SequenceDefinition(Instant first, Instant last, Duration duration) {
+      this.first = first;
+      this.last = last;
+      this.durationMilliSec = duration.getMillis();
+    }
+
+    @Override
+    public boolean equals(Object obj) {
+      if (this == obj) {
+        return true;
+      }
+
+      if (obj == null || obj.getClass() != this.getClass()) {
+        return false;
+      }
+
+      SequenceDefinition src = (SequenceDefinition) obj;
+      return src.first.equals(this.first)
+          && src.last.equals(this.last)
+          && src.durationMilliSec.equals(this.durationMilliSec);
+    }
+
+    @Override
+    public int hashCode() {
+      int result = Objects.hash(first, last, durationMilliSec);
+      return result;
+    }

Review comment:
       I tried to use it, but didn't manage to get AutoValue to work. It failed to properly detect the constructor for the generated class. I tried to debug it, but it would be better to handle that in a separate PR.







[GitHub] [beam] TheNeuralBit commented on a change in pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #11477:
URL: https://github.com/apache/beam/pull/11477#discussion_r418724351



##########
File path: examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java
##########
@@ -785,4 +787,73 @@ public static void main(String[] args) {
 
     }
   }
+
+  public static class PeriodicallyUpdatingSideInputs {
+
+    public static PCollection<Long> main(
+        Pipeline p,
+        Instant startAt,
+        Instant stopAt,
+        Duration interval1,
+        Duration interval2,
+        String fileToRead) {
+      // [START PeriodicallyUpdatingSideInputs]
+      PCollectionView<List<Long>> sideInput =
+          p.apply(
+                  "SIImpulse",
+                  PeriodicImpulse.create()
+                      .startAt(startAt)
+                      .stopAt(stopAt)
+                      .withInterval(interval1)
+                      .applyWindowing())
+              .apply(
+                  "FileToRead",
+                  ParDo.of(
+                      new DoFn<Instant, String>() {
+                        @DoFn.ProcessElement
+                        public void process(@Element Instant notUsed, OutputReceiver<String> o) {
+                          o.output(fileToRead);
+                        }
+                      }))
+              .apply(FileIO.matchAll())
+              .apply(FileIO.readMatches())
+              .apply(TextIO.readFiles())
+              .apply(
+                  ParDo.of(
+                      new DoFn<String, String>() {
+                        @ProcessElement
+                        public void process(@Element String src, OutputReceiver<String> o) {
+                          System.out.println(src);

Review comment:
       Sorry, just noticed this. Can you replace it with a log statement?
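
A rough sketch of what such a replacement could look like. Beam code normally logs through SLF4J; this example uses `java.util.logging` purely so that it runs without extra dependencies, and the names `LineLogger` and `logAndPass` are hypothetical, not taken from the snippet under review.

```java
import java.util.logging.Logger;

/** Hypothetical helper showing a log call in place of System.out.println. */
final class LineLogger {
  static final Logger LOG = Logger.getLogger(LineLogger.class.getName());

  /** Logs the line at INFO level and returns it unchanged. */
  static String logAndPass(String src) {
    LOG.info("Read line: " + src);
    return src;
  }

  public static void main(String[] args) {
    logAndPass("example line");
  }
}
```

Unlike `System.out.println`, a logger call carries a level and a logger name, so the output can be filtered or routed by whatever logging configuration the worker uses.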







[GitHub] [beam] TheNeuralBit commented on pull request #11477: Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #11477:
URL: https://github.com/apache/beam/pull/11477#issuecomment-618672920


   Is there a Jira issue for this?





[GitHub] [beam] Ardagan removed a comment on pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan removed a comment on pull request #11477:
URL: https://github.com/apache/beam/pull/11477#issuecomment-620876614


   Run Java PreCommit





[GitHub] [beam] Ardagan commented on pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan commented on pull request #11477:
URL: https://github.com/apache/beam/pull/11477#issuecomment-620876614


   Run Java PreCommit





[GitHub] [beam] Ardagan commented on pull request #11477: [BEAM-9650] Add PeriodicSequence generator.

Posted by GitBox <gi...@apache.org>.
Ardagan commented on pull request #11477:
URL: https://github.com/apache/beam/pull/11477#issuecomment-624198780


   Run Website_Stage_GCS PreCommit

