Posted to dev@hama.apache.org by "Edward J. Yoon" <ed...@samsung.com> on 2014/12/17 00:59:55 UTC

Streaming, multi-BSP job pipelines, and streaming graph&incremental ML

Guys,

As you know, the trend has already begun to shift from batch toward realtime and streaming processing. To support streaming graph processing and incremental learning in Hama, I recently began a full-scale investigation into streaming data processing[1] and multi-BSP job pipelines[2].

Basically, the problem is how to process an unstructured input stream and transfer its output stream to the next "advanced streaming analytics" job without overhead. There is also a tricky issue in determining where "new data" and "updates" should be delivered. Some systems use shared memory or support only micro-batch algorithms, but we can solve this problem efficiently and directly by message passing between multiple jobs.
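
To make the idea concrete, here is a rough sketch of two pipeline stages connected only by messages. It is an illustration under assumptions, not Hama's API: the class and method names are hypothetical, each thread stands in for a separate BSP job, and an in-memory queue stands in for the inter-job message channel. The first stage parses raw lines and sends each token downstream as soon as it is available; the second stage updates its counts incrementally per message, with no shared state and no micro-batching.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only: none of these names come from Hama.
public class StreamingPipelineSketch {

    // Sentinel message that tells the downstream stage the stream has ended.
    private static final String END_OF_STREAM = "\u0000EOS";

    // Stage 1 (stands in for the first job): split raw lines into tokens
    // and send each token to the next stage as an individual message.
    static Thread parserStage(List<String> rawLines, BlockingQueue<String> outbox) {
        return new Thread(() -> {
            try {
                for (String line : rawLines) {
                    for (String token : line.trim().split("\\s+")) {
                        if (!token.isEmpty()) {
                            outbox.put(token);
                        }
                    }
                }
                outbox.put(END_OF_STREAM);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Wires the two stages together and returns the downstream stage's result.
    static Map<String, Integer> run(List<String> rawLines) throws InterruptedException {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();
        Map<String, Integer> counts = new HashMap<>();

        Thread parser = parserStage(rawLines, channel);

        // Stage 2 (stands in for the "advanced streaming analytics" job):
        // apply each incoming message as an incremental update.
        Thread analytics = new Thread(() -> {
            try {
                for (String msg = channel.take(); !msg.equals(END_OF_STREAM); msg = channel.take()) {
                    counts.merge(msg, 1, Integer::sum);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        parser.start();
        analytics.start();
        parser.join();
        analytics.join(); // join() makes the analytics thread's writes visible here
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of("a b", "b  c")));
    }
}
```

In a real multi-job pipeline the queue would be replaced by whatever inter-job messaging the framework provides, and the question of where "new data" and "updates" go becomes a question of which downstream peer each message is addressed to.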

1. https://issues.apache.org/jira/browse/HAMA-883
2. https://issues.apache.org/jira/browse/HAMA-901

Re: Streaming, multi-BSP job pipelines, and streaming graph&incremental ML

Posted by Tommaso Teofili <to...@gmail.com>.
2014-12-17 0:59 GMT+01:00 Edward J. Yoon <ed...@samsung.com>:
>
> Guys,
>
> As you know, the trend already began to drift towards focusing on realtime
> and streaming instead of batch. To support streaming graph and incremental
> learning in Hama, I recently began a full-scale investigation about
> streaming data processing[1] and multi-BSP job pipelines[2].
>
> Basically, the problem is how to process the unstructured input stream and
> transfer its output stream to the next "advanced streaming analytics" job
> without overheads. In here, there's also tricky issue in determining where
> should "new data" and "updates" be delivered. Some uses shared memory or
> only supports micro-batch algorithms, but we can efficiently and directly
> solve this problem by message-passing between multi jobs.
>

Agreed, message passing seems to me the right approach.

Regards,
Tommaso


>
> 1. https://issues.apache.org/jira/browse/HAMA-883
> 2. https://issues.apache.org/jira/browse/HAMA-901