Posted to issues@streampipes.apache.org by "Grainier Perera (Jira)" <ji...@apache.org> on 2021/01/29 03:41:00 UTC

[jira] [Updated] (STREAMPIPES-292) Introduce event windowing to the StreamPipes core/sdk

     [ https://issues.apache.org/jira/browse/STREAMPIPES-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grainier Perera updated STREAMPIPES-292:
----------------------------------------
    Description: 
h3. +*Apache StreamPipes*+

Apache StreamPipes (incubating) is a self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams. StreamPipes offers several modules including StreamPipes Connect to easily connect data from industrial IoT sources, the Pipeline Editor to quickly create processing pipelines and several visualization modules for live and historic data exploration. Under the hood, StreamPipes utilizes an event-driven microservice paradigm of standalone, so-called analytics microservices making the system easy to extend for individual needs.
h3. +*Background*+

Currently, window logic is defined individually for each pipeline element: the windowing logic must be declared in the controller, and the runtime logic must be added separately for the selected runtime wrapper (Java, Siddhi, Flink, etc.).

Because many data processors benefit from window functions (e.g., PEs such as Event Counter, Count Aggregation, and Rate Limiter), windowing logic is often duplicated: it has to be implemented again for every new pipeline element. In addition, the set of supported window operators differs between elements (and often depends on the developer), because it is unclear which windows and parameters should or can be offered.

Therefore, adding support for explicit window semantics to the SDK/Core would make implementing data processors and sinks using windows much easier and less error-prone.
h3. +*Tasks*+
 # Design and introduce new processor and controller classes for windowed event processors (e.g., WindowedDataProcessor) that handle the windowing logic internally and expose only higher-level methods to users (e.g., onCurrentEvent, onExpiredEvent, etc.).
 # Implement the internal logic for a few window functions (e.g., TimeWindow, LengthWindow, TimeBatchWindow, LengthBatchWindow, etc.).
 # Write a few sample pipeline-elements using your new API!
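
To make the first two tasks more concrete, here is a minimal sketch of what such an API might look like. It is only an illustration, not an existing StreamPipes interface: the names WindowedDataProcessor, onCurrentEvent and onExpiredEvent are taken from the task list above, while process(), SlidingSum and WindowSketch are hypothetical names invented for this example. The base class assumes a sliding LengthWindow and ignores the controller side and the runtime wrappers entirely.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical demo entry point; not part of StreamPipes.
class WindowSketch {
    public static void main(String[] args) {
        SlidingSum sum = new SlidingSum(3);
        for (double v : new double[] {1, 2, 3, 4}) {
            sum.process(v);
        }
        // The window now holds 2, 3 and 4; the event 1 has expired.
        System.out.println(sum.sum()); // prints 9.0
    }
}

// Assumed base class: keeps a sliding window of the last `length` events
// and hides the bookkeeping from the pipeline element developer.
abstract class WindowedDataProcessor<E> {
    private final Deque<E> window = new ArrayDeque<>();
    private final int length;

    protected WindowedDataProcessor(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("window length must be positive");
        }
        this.length = length;
    }

    // Hypothetical entry point that a runtime would call once per incoming event.
    public final void process(E event) {
        window.addLast(event);
        onCurrentEvent(event);
        if (window.size() > length) {
            // The oldest event falls out of the window.
            onExpiredEvent(window.removeFirst());
        }
    }

    protected abstract void onCurrentEvent(E event);

    protected abstract void onExpiredEvent(E event);
}

// Sample element: maintains the sum over the events currently in the window.
class SlidingSum extends WindowedDataProcessor<Double> {
    private double sum = 0.0;

    SlidingSum(int length) {
        super(length);
    }

    @Override
    protected void onCurrentEvent(Double value) {
        sum += value;
    }

    @Override
    protected void onExpiredEvent(Double value) {
        sum -= value;
    }

    double sum() {
        return sum;
    }
}
```

With this shape, a sample element only implements the two callbacks. A TimeWindow or one of the batch variants would presumably swap out the expiry rule inside process() without changing the element code, though how that maps onto the different runtime wrappers is left open here.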

h3. +*Relevant Skills*+
 * Basic knowledge of the StreamPipes core (cloning the repo and going through the codebase and documentation would do).
 * Basic knowledge of stream analytics window functions (this is not a must, but it's awesome if you know your way around analytics window functions).
 * Some Java experience.

h3. +*Learning Material*+

+For StreamPipes:+
 * [https://streampipes.apache.org/docs/]
 * [https://streampipes.apache.org/media.html]
 * [https://github.com/apache/incubator-streampipes]
 * [https://github.com/apache/incubator-streampipes-extensions]

+For Streaming Analytics:+
 * [https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions]
 * [https://www.mikulskibartosz.name/difference-between-tumbling-and-sliding-window/]
 * [https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/developing-storm-applications/content/understanding_sliding_and_tumbling_windows.html]

+For context on this issue:+
 * [https://www.mail-archive.com/dev@streampipes.apache.org/msg00868.html]

+*Mentor*+
 - Grainier Perera.


> Introduce event windowing to the StreamPipes core/sdk
> -----------------------------------------------------
>
>                 Key: STREAMPIPES-292
>                 URL: https://issues.apache.org/jira/browse/STREAMPIPES-292
>             Project: StreamPipes
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Grainier Perera
>            Priority: Major
>              Labels: gsoc2021, streampipes



--
This message was sent by Atlassian Jira
(v8.3.4#803005)