Posted to dev@samoa.apache.org by Sandesh Hegde <sa...@datatorrent.com> on 2015/10/28 14:50:35 UTC

Re: Integration with Apache Samoa

Does it need iteration support?  It would be good to discuss this feature
on both mailing lists together.

Adding Samoa mailing list.

On Wed, Oct 28, 2015, 4:28 AM Sandeep Deshmukh <sa...@datatorrent.com>
wrote:

> +1
>
> Regards
> Sandeep
>
> Regards,
> Sandeep
>
> On Wed, Oct 28, 2015 at 11:36 AM, Amol Kekre <am...@datatorrent.com> wrote:
>
> > +1
> >
> > Amol
> >
> > On Tue, Oct 27, 2015 at 10:27 PM, Bhupesh Chawda <bhupesh@datatorrent.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Apache Samoa <https://samoa.incubator.apache.org/> is a distributed
> > > streaming machine learning framework that contains a programming
> > > abstraction for distributed streaming machine learning algorithms.
> > > Apache SAMOA enables development of new ML algorithms without directly
> > > dealing with the complexity of underlying distributed stream processing
> > > engines (DSPEs, such as Apache Storm, Apache S4, and Apache Samza).
> > > Apache SAMOA users can develop distributed streaming ML algorithms once
> > > and execute them on multiple DSPEs.
> > >
> > > Apache Samoa currently has integrations with Apache Storm, Apache Flink,
> > > Apache S4 and Apache Samza. This means the ML algorithms developed on
> > > Apache Samoa can run on these platforms without any change in the
> > > algorithms.
> > > It would be a good idea to integrate Apache Apex as a distributed stream
> > > processing engine (DSPE) into Apache Samoa, which would allow users to
> > > run ML algorithms developed in Samoa on Apache Apex.
> > >
> > > Here is the Apex JIRA for integration work:
> > > https://malhar.atlassian.net/browse/APEX-202
> > > Also, here is the JIRA in SAMOA project:
> > > https://issues.apache.org/jira/browse/SAMOA-49
> > >
> > > Thanks.
> > >
> >
>

Re: Integration with Apache Samoa

Posted by Mohit Jotwani <mo...@datatorrent.com>.
Finally! Congratulations!!

Very well done....

Regards,
Mohit

On Mon, Nov 7, 2016 at 2:55 PM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Hi All,
>
> The PR for making Apex a runner for SAMOA has been merged.
>
> Apache SAMOA now has an additional runner with Apache Apex -
> https://github.com/apache/incubator-samoa/tree/master/samoa-apex
>
> Thanks.
>
> ~ Bhupesh

Re: Integration with Apache Samoa

Posted by Sanjay Pujare <sa...@datatorrent.com>.
Yes, great job and impressive.

On 11/7/16, 11:23 AM, "Sandesh Hegde" <sa...@datatorrent.com> wrote:

    Good work Bhupesh.
    
    On Mon, Nov 7, 2016 at 11:17 AM David Yan <da...@datatorrent.com> wrote:
    
    > It took perseverance to get this merged, Good work Bhupesh!
    >
    > On Mon, Nov 7, 2016 at 1:25 AM, Bhupesh Chawda <bh...@datatorrent.com>
    > wrote:
    >
    > > Hi All,
    > >
    > > The PR for making Apex a runner for SAMOA has been merged.
    > >
    > > Apache SAMOA now has an additional runner with Apache Apex -
    > > https://github.com/apache/incubator-samoa/tree/master/samoa-apex
    > >
    > > Thanks.
    > >
    > > ~ Bhupesh
    



Re: Integration with Apache Samoa

Posted by Sandesh Hegde <sa...@datatorrent.com>.
Good work Bhupesh.

On Mon, Nov 7, 2016 at 11:17 AM David Yan <da...@datatorrent.com> wrote:

> It took perseverance to get this merged, Good work Bhupesh!
>
> On Mon, Nov 7, 2016 at 1:25 AM, Bhupesh Chawda <bh...@datatorrent.com>
> wrote:
>
> > Hi All,
> >
> > The PR for making Apex a runner for SAMOA has been merged.
> >
> > Apache SAMOA now has an additional runner with Apache Apex -
> > https://github.com/apache/incubator-samoa/tree/master/samoa-apex
> >
> > Thanks.
> >
> > ~ Bhupesh

Re: Integration with Apache Samoa

Posted by David Yan <da...@datatorrent.com>.
It took perseverance to get this merged, Good work Bhupesh!

On Mon, Nov 7, 2016 at 1:25 AM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Hi All,
>
> The PR for making Apex a runner for SAMOA has been merged.
>
> Apache SAMOA now has an additional runner with Apache Apex -
> https://github.com/apache/incubator-samoa/tree/master/samoa-apex
>
> Thanks.
>
> ~ Bhupesh

Re: Integration with Apache Samoa

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Hi All,

The PR for making Apex a runner for SAMOA has been merged.

Apache SAMOA now has an additional runner with Apache Apex -
https://github.com/apache/incubator-samoa/tree/master/samoa-apex

Thanks.

~ Bhupesh

On Mon, Mar 28, 2016 at 11:05 AM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Hi All,
>
> Here is the status of the integration, since the last update:
>
>    - Launch process cleaned up. Launch still happening through DTCli
>    calls. Once APEXCORE-405
>    <https://issues.apache.org/jira/browse/APEXCORE-405> is implemented,
>    this will become a lot better.
>    - Some issues still causing problems in running Samoa apps on Apex.
>    Containers getting killed. This may be due to low memory. This is work in
>    progress.
>
> ~Bhupesh

Re: Integration with Apache Samoa

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Hi All,

Here is the status of the integration, since the last update:

   - Launch process cleaned up. Launch still happening through DTCli calls.
   Once APEXCORE-405 <https://issues.apache.org/jira/browse/APEXCORE-405>
   is implemented, this will become a lot better.
   - Some issues still causing problems in running Samoa apps on Apex.
   Containers getting killed. This may be due to low memory. This is work in
   progress.

~Bhupesh

On Wed, Mar 2, 2016 at 10:14 AM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Hi David,
>
> Here is the working branch you can look at:
> https://github.com/bhupeshchawda/incubator-samoa/tree/samoa-apex
>
> As I mentioned, the launch part needs to be worked on. Currently I have a
> few hacks in my local environment.
> You can use the test cases though to get an idea.
>
> -Bhupesh

Re: Integration with Apache Samoa

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Hi David,

Here is the working branch you can look at:
https://github.com/bhupeshchawda/incubator-samoa/tree/samoa-apex

As I mentioned, the launch part needs to be worked on. Currently I have a
few hacks in my local environment.
You can use the test cases though to get an idea.

-Bhupesh

On Wed, Mar 2, 2016 at 1:01 AM, David Yan <da...@datatorrent.com> wrote:

> Hi Bhupesh,
>
> That's good progress.  Can you send us a link to the code you did for
> this?  Or maybe a review-only PR?
>
> David

Re: Integration with Apache Samoa

Posted by David Yan <da...@datatorrent.com>.
Hi Bhupesh,

That's good progress.  Can you send us a link to the code you did for
this?  Or maybe a review-only PR?

David

On Mon, Feb 29, 2016 at 10:15 PM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Hi All,
>
> Here is the status of integration of Apache Apex into Apache Samoa.
>
>    - Samoa API implemented and able to convert Samoa topology into an Apex
>    DAG.
>    - Implemented partitioning support using parallelism hints from Samoa
>    API.
>    - Implemented stream multiplexing:
>       - Added All-based partitioner. Upstream tuples go to all downstream
>       partitions
>       - Stream codec for Key based partitioning
>       - Stream codec for Random partitioning
>    - Able to launch a Samoa task on the cluster, but the launch still needs
>    work. Currently it relies on hacks that call DTCli explicitly from the
>    main entry point in the Samoa code, and jars have to be bundled manually.
>    This will be worked on in this sprint.
>    - Tested the following algorithms on local cluster:
>       - Prequential Evaluation using Vertical Hoeffding Tree classifier.
>       This is a decision tree based classifier.
>       - Clustering using CluStream algorithm.
>    - I have asked for clarification on some more details of these algorithms,
>    as well as on serialization issues with Samoa classes. I am waiting for a
>    response from the Samoa community.
>    - Although Samoa does not have many algorithms currently (it is a
>    framework for developing algorithms), more algorithms are expected as a
>    part of their roadmap:
>    https://cwiki.apache.org/confluence/display/SAMOA/Roadmap
>
> Thanks,
>
> Bhupesh
>

Re: Integration with Apache Samoa

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Hi All,

Here is the status of integration of Apache Apex into Apache Samoa.

   - Samoa API implemented and able to convert Samoa topology into an Apex DAG.
   - Implemented partitioning support using parallelism hints from Samoa
   API.
   - Implemented stream multiplexing:
      - Added All-based partitioner. Upstream tuples go to all downstream
      partitions
      - Stream codec for Key based partitioning
      - Stream codec for Random partitioning
   - Able to launch a Samoa task on the cluster, but the launch still needs
   work. Currently it relies on hacks that call DTCli explicitly from the
   main entry point in the Samoa code, and jars have to be bundled manually.
   This will be worked on in this sprint.
   - Tested the following algorithms on local cluster:
      - Prequential Evaluation using Vertical Hoeffding Tree classifier.
      This is a decision tree based classifier.
      - Clustering using CluStream algorithm.
   - I have asked for clarification on some more details of these algorithms,
   as well as on serialization issues with Samoa classes. I am waiting for a
   response from the Samoa community.
   - Although Samoa does not have many algorithms currently (it is a
   framework for developing algorithms), more algorithms are expected as a
   part of their roadmap:
   https://cwiki.apache.org/confluence/display/SAMOA/Roadmap

Thanks,

Bhupesh
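
For readers unfamiliar with the three routing schemes named in the status
list above (All-based, Key-based and Random), the following is a minimal
sketch in plain Java of how each one picks downstream partitions. It is an
illustration only: the class and method names (RoutingSketch, keyPartition,
randomPartition, allPartitions) are hypothetical, and it does not reproduce
the samoa-apex code or the Apex stream codec and partitioner interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative only: hypothetical names, not the samoa-apex classes.
class RoutingSketch {

    // Key-based: tuples with the same key always go to the same partition.
    static int keyPartition(Object key, int numPartitions) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Random (shuffle): any single partition may receive the tuple.
    static int randomPartition(Random rng, int numPartitions) {
        return rng.nextInt(numPartitions);
    }

    // All-based (broadcast): every downstream partition receives the tuple.
    static List<Integer> allPartitions(int numPartitions) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            targets.add(i);
        }
        return targets;
    }
}
```

Key-based routing is what lets a downstream partition keep per-key state,
random routing only balances load, and All-based routing trades bandwidth
for every partition seeing every tuple.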

Re: Integration with Apache Samoa

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Thanks all for the replies.
@Bind seems to be working for my case.

-Bhupesh
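
One caveat from the discussion quoted below is worth reproducing with the
JDK alone: Java deserialization does not run the constructor of a
Serializable class, so transient fields come back null unless a readObject
method re-creates them. The sketch below assumes nothing from Apex or Kryo;
TransientDemo, WithoutHook, WithHook and roundTrip are hypothetical names,
and a StringBuilder stands in for an operator port.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative only: hypothetical classes, no Apex or Kryo involved.
class TransientDemo {

    // The transient field is set only in the constructor, so it is lost
    // when the object is restored by Java deserialization.
    static class WithoutHook implements Serializable {
        private static final long serialVersionUID = 1L;
        transient StringBuilder port;

        WithoutHook() {
            port = new StringBuilder("ready");
        }
    }

    // Same class, but readObject re-runs the initialization.
    static class WithHook implements Serializable {
        private static final long serialVersionUID = 1L;
        transient StringBuilder port;

        WithHook() {
            init();
        }

        private void init() {
            port = new StringBuilder("ready");
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            init();
        }
    }

    // Serializes the value to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip, the port field of WithoutHook is null while the one in
WithHook has been rebuilt, which is why the quoted operator example calls
init() from both the constructor and readObject.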

On Thu, Jan 14, 2016 at 1:03 PM, Thomas Weise <th...@datatorrent.com>
wrote:

> The suggestion wasn't to serialize an operator (which was already designed
> to work with Kryo), but those fields in the operator that may only be Java
> serializable. Please use the annotation with those fields.
>
> On Wed, Jan 13, 2016 at 11:22 PM, Tushar Gosavi <tu...@datatorrent.com>
> wrote:
>
> > One approach I found is that we could annotate the operator
> > with @DefaultSerializer(JavaSerializer.class), but when the operator is
> > de-serialized the constructor is not called, and the port objects in the
> > operator are left uninitialized (null). To handle this you will have to
> > provide a readObject method which initializes the port objects after the
> > object is read.
> >
> > Something like the code below:
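> > ```
> > // Package names below are from Apex 3.x and Kryo 2.x; adjust for your versions.
> > import java.io.IOException;
> > import java.io.ObjectInputStream;
> > import java.io.Serializable;
> >
> > import com.datatorrent.api.DefaultInputPort;
> > import com.datatorrent.api.DefaultOutputPort;
> > import com.datatorrent.common.util.BaseOperator;
> > import com.esotericsoftware.kryo.DefaultSerializer;
> > import com.esotericsoftware.kryo.serializers.JavaSerializer;
> >
> > @DefaultSerializer(JavaSerializer.class)
> > public abstract class BaseSinglePortOperator<A,B> extends BaseOperator
> >     implements Serializable
> > {
> >   public transient DefaultOutputPort<B> output;
> >   public transient DefaultInputPort<A> input;
> >
> >   // Creates the ports; called from both the constructor and readObject.
> >   private void init() {
> >     output = new DefaultOutputPort<>();
> >
> >     input = new DefaultInputPort<A>() {
> >       @Override
> >       public void process(A tuple)
> >       {
> >         processTuple(tuple);
> >       }
> >     };
> >   }
> >
> >   protected abstract void processTuple(A tuple);
> >
> >   public BaseSinglePortOperator() {
> >     init();
> >   }
> >
> >   // Java deserialization skips the constructor, so re-create the ports here.
> >   private void readObject(ObjectInputStream in)
> >       throws IOException, ClassNotFoundException
> >   {
> >     in.defaultReadObject();
> >     init();
> >   }
> > }
> > ```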
> >
> > - Tushar.
> >
> >
> > On Thu, Jan 14, 2016 at 10:51 AM, Gaurav Gupta <ga...@datatorrent.com>
> > wrote:
> >
> > > Bhupesh,
> > >
> > > You can get more details here:
> > > http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
> > >
> > > Thanks
> > > - Gaurav
> > >
> > > > On Jan 12, 2016, at 9:36 AM, Gaurav Gupta <ga...@datatorrent.com>
> > > wrote:
> > > >
> > > > Bhupesh,
> > > >
> > > > There are two ways to handle this:
> > > >
> > > > 1. If you can change the classes, add a default constructor to them.
> > > > 2. If you can’t change the classes, you can use a custom serializer for
> > > > them via Kryo’s @Bind annotation:
> > > >
> > > > @Bind(JavaSerializer.class)
> > > > SetMultimap<String, String> someMap;
> > > >
> > > > This will work when there is an existing alternative serializer for the
> > > > type in question.
> > > >
> > > > Thanks
> > > > - Gaurav
> > > >
> > > > >> On Jan 12, 2016, at 5:00 AM, Bhupesh Chawda <bhupesh@datatorrent.com> wrote:
> > > >>
> > > >> Hi All,
> > > >>
> > > > >> I am facing an issue with Kryo where some classes (which are a part of
> > > > >> operators in the DAG) in the imported jars do not have a zero-argument
> > > > >> constructor. This results in a KryoException.
> > > > >>
> > > > >> Any suggestions on how to handle this?
> > > > >> In general, how can we deal with classes which do not have default
> > > > >> constructors, without modifying them?
> > > >>
> > > >> Thanks.
> > > >> -Bhupesh
> > > >>
> > > >> On Thu, Oct 29, 2015 at 10:10 PM, Vlad Rozov <
> v.rozov@datatorrent.com
> > > <ma...@datatorrent.com>>
> > > >> wrote:
> > > >>
> > > >>> +1
> > > >>>
> > > >>>
> > > >>> On 10/29/15 08:47, Pramod Immaneni wrote:
> > > >>>
> > > >>>> +1
> > > >>>>
> > > >>>> On Thu, Oct 29, 2015 at 8:34 AM, Amol Kekre <amol@datatorrent.com
> > > <ma...@datatorrent.com>> wrote:
> > > >>>>
> > > >>>> Samoa can be used to test iteration support as that feature gets
> > > >>>>> developed.
> > > >>>>>
> > > >>>>> Amol
> > > >>>>>
> > > >>>>>
> > > >>>>> On Thu, Oct 29, 2015 at 5:12 AM, Bhupesh Chawda <
> > > bhupesh@datatorrent.com <ma...@datatorrent.com>
> > > >>>>>>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>> Yes, iteration support will be needed for quite a few algorithms.
> > > >>>>>>
> > > >>>>>> Thanks.
> > > >>>>>> Bhupesh
> > > >>>>>>
> > > >>>>>> On Wed, Oct 28, 2015 at 7:20 PM, Sandesh Hegde <
> > > sandesh@datatorrent.com <ma...@datatorrent.com>
> > > >>>>>>>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>> Does it need iteration support?  Good idea to discuss this
> feature
> > > in
> > > >>>>>>>
> > > >>>>>> both
> > > >>>>>>
> > > >>>>>>> the mailing list together.
> > > >>>>>>>
> > > >>>>>>> Adding Samoa mailing list.
> > > >>>>>>>
> > > >>>>>>> On Wed, Oct 28, 2015, 4:28 AM Sandeep Deshmukh <
> > > >>>>>>>
> > > >>>>>> sandeep@datatorrent.com <ma...@datatorrent.com>>
> > > >>>>>
> > > >>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>> +1
> > > >>>>>>>>
> > > >>>>>>>> Regards
> > > >>>>>>>> Sandeep
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Sandeep
> > > >>>>>>>>
> > > >>>>>>>> On Wed, Oct 28, 2015 at 11:36 AM, Amol Kekre <
> > > amol@datatorrent.com <ma...@datatorrent.com>>
> > > >>>>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> +1
> > > >>>>>>>>>
> > > >>>>>>>>> Amol
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, Oct 27, 2015 at 10:27 PM, Bhupesh Chawda <
> > > >>>>>>>>>
> > > >>>>>>>> bhupesh@datatorrent.com <ma...@datatorrent.com>>
> > > >>>>>>>>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Hi All,
> > > >>>>>>>>>>
> > > >>>>>>>>>> Apache Samoa <https://samoa.incubator.apache.org/ <
> > > https://samoa.incubator.apache.org/>> is a
> > > >>>>>>>>>>
> > > >>>>>>>>> distributed
> > > >>>>>>
> > > >>>>>>> streaming machine learning framework that contains a
> programming
> > > >>>>>>>>>> abstraction for distributed streaming machine learning
> > > >>>>>>>>>>
> > > >>>>>>>>> algorithms.
> > > >>>>>
> > > >>>>>> Apache
> > > >>>>>>>>
> > > >>>>>>>>> SAMOA enables development of new ML algorithms without
> directly
> > > >>>>>>>>>>
> > > >>>>>>>>> dealing
> > > >>>>>>>
> > > >>>>>>>> with the complexity of underlying distributed stream
> processing
> > > >>>>>>>>>>
> > > >>>>>>>>> engines
> > > >>>>>>>
> > > >>>>>>>> (DSPEe, such as Apache Storm, Apache S4, and Apache Samza).
> > > >>>>>>>>>>
> > > >>>>>>>>> Apache
> > > >>>>>
> > > >>>>>> SAMOA
> > > >>>>>>>>
> > > >>>>>>>>> users can develop distributed streaming ML algorithms once
> and
> > > >>>>>>>>>>
> > > >>>>>>>>> execute
> > > >>>>>>>
> > > >>>>>>>> them
> > > >>>>>>>>>
> > > >>>>>>>>>> on multiple DSPEs.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Apache Samoa currently has integrations with Apache Storm,
> > > Apache
> > > >>>>>>>>>>
> > > >>>>>>>>> Flink,
> > > >>>>>>>>
> > > >>>>>>>>> Apache S4 and Apache Samza. This means the ML algorithms
> > > >>>>>>>>>>
> > > >>>>>>>>> developed
> > > >>>>>
> > > >>>>>> on
> > > >>>>>>
> > > >>>>>>> Apache Samoa can run on these platforms without any change in
> the
> > > >>>>>>>>>> algorithms.
> > > >>>>>>>>>> It would be a good idea to integrate Apache Apex as a
> > > distributed
> > > >>>>>>>>>>
> > > >>>>>>>>> stream
> > > >>>>>>>>
> > > >>>>>>>>> processing engine (DSPE) into Apache Samoa which would allow
> > > >>>>>>>>>>
> > > >>>>>>>>> users
> > > >>>>>
> > > >>>>>> to
> > > >>>>>>
> > > >>>>>>> run
> > > >>>>>>>>
> > > >>>>>>>>> ML algorithms developed in Samoa on Apache Apex.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Here is the Apex JIRA for integration work:
> > > >>>>>>>>>> https://malhar.atlassian.net/browse/APEX-202 <
> > > https://malhar.atlassian.net/browse/APEX-202>
> > > >>>>>>>>>> Also, here is the JIRA in SAMOA project:
> > > >>>>>>>>>> https://issues.apache.org/jira/browse/SAMOA-49
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>
> > > >
> > >
> > >
> >
>

Re: Integration with Apache Samoa

Posted by Thomas Weise <th...@datatorrent.com>.
The suggestion wasn't to use Java serialization for the operator itself
(operators are already designed to work with Kryo), but for those fields in
the operator that may only be Java-serializable. Please use the annotation on
those fields.

On Wed, Jan 13, 2016 at 11:22 PM, Tushar Gosavi <tu...@datatorrent.com>
wrote:

> One approach I found is that we could annotate operator
> with @DefaultSerializer(JavaSerializer.class), but when operator is
> de-serialized the constructor is not called, and port objects in operators
> are left uninitialized (null). To handle this you will have to provide
> readObject method which will initialize port object after object is read.
>
> something like below code
> ```
> @DefaultSerializer(JavaSerializer.class)
> public abstract class BaseSinglePortOperator<A,B> extends BaseOperator
> implements Serializable
> {
>   public transient DefaultOutputPort<B> output;
>   public transient DefaultInputPort<A> input;
>
>   private void init() {
>     output = new DefaultOutputPort<>();
>
>     input = new DefaultInputPort<A>() {
>       @Override
>       public void process(A tuple)
>       {
>         processTuple(tuple);
>       }
>     };
>   }
>
>   protected abstract void processTuple(A tuple);
>
>   public BaseSinglePortOperator() {
>     init();
>   }
>
>   private void readObject(ObjectInputStream in) throws IOException,
> ClassNotFoundException
>   {
>     in.defaultReadObject();
>     init();
>   }
> }
> ```
>
> - Tushar.
>
>
> On Thu, Jan 14, 2016 at 10:51 AM, Gaurav Gupta <ga...@datatorrent.com>
> wrote:
>
> > Bhupesh,
> >
> > You can get more details here:
> > http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
> >
> > Thanks
> > - Gaurav
> >
> > > On Jan 12, 2016, at 9:36 AM, Gaurav Gupta <ga...@datatorrent.com>
> > wrote:
> > >
> > > Bhupesh,
> > >
> > > There are following two ways
> > >
> > > 1. If you can change the classes, add default constructor to these
> > classes.
> > > 2. If you can’t change the classes, you can use custom serializer for
> > these classes using Kryo’s @Bind annotation
> > >
> > > @Bind(JavaSerializer.class)
> > > SetMultimap<String, String> someMap;
> > >
> > > This will work when there is an existing alternative serializer for the
> > > type in question.
> > >
> > > Thanks
> > > - Gaurav
> > >
> > >> On Jan 12, 2016, at 5:00 AM, Bhupesh Chawda <bhupesh@datatorrent.com
> > <ma...@datatorrent.com>> wrote:
> > >>
> > >> Hi All,
> > >>
> > >> I am facing an issue with Kryo where some classes (which are a part of
> > >> operators in Dag) in the imported jars do not have a zero-argument
> > >> constructor. This results in a KryoException.
> > >>
> > >> Any suggestions on how to handle this?
> > >> In general how can we deal with classes which do not have default
> > >> constructors, without modifying them?
> > >>
> > >> Thanks.
> > >> -Bhupesh

Re: Integration with Apache Samoa

Posted by Tushar Gosavi <tu...@datatorrent.com>.
One approach I found is to annotate the operator
with @DefaultSerializer(JavaSerializer.class). However, when the operator is
deserialized its constructor is not called, so the port objects in the
operator are left uninitialized (null). To handle this, you have to provide a
readObject method that re-initializes the port objects after the object is
read.

Something like the code below:
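```
// Imports added for completeness; the package names assume Apex 3.x and Kryo 2.x.
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;

import com.datatorrent.api.DefaultInputPort;
import com.datatorrent.api.DefaultOutputPort;
import com.datatorrent.common.util.BaseOperator;
import com.esotericsoftware.kryo.DefaultSerializer;
import com.esotericsoftware.kryo.serializers.JavaSerializer;

@DefaultSerializer(JavaSerializer.class)
public abstract class BaseSinglePortOperator<A, B> extends BaseOperator
    implements Serializable
{
  // Ports are transient, so they have to be re-created after deserialization.
  public transient DefaultOutputPort<B> output;
  public transient DefaultInputPort<A> input;

  // Creates the ports; called from the constructor and from readObject.
  private void init() {
    output = new DefaultOutputPort<>();

    input = new DefaultInputPort<A>() {
      @Override
      public void process(A tuple)
      {
        processTuple(tuple);
      }
    };
  }

  protected abstract void processTuple(A tuple);

  public BaseSinglePortOperator() {
    init();
  }

  // Java serialization does not run this class's constructor, so restore the ports here.
  private void readObject(ObjectInputStream in)
      throws IOException, ClassNotFoundException
  {
    in.defaultReadObject();
    init();
  }
}
```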
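
The effect described above can be reproduced with the JDK alone. The following is only a rough sketch using plain Java serialization, without Apex or Kryo: the class names are made up for illustration and a StringBuilder stands in for a port object. It shows that a transient field comes back as null unless readObject re-creates it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-ins for an operator; not Apex classes.
class TransientPortDemo {

  // No readObject: the transient field is not restored after deserialization.
  static class WithoutReadObject implements Serializable {
    private static final long serialVersionUID = 1L;
    transient StringBuilder port = new StringBuilder("port");
    String name = "op";
  }

  // readObject re-runs init(), so the transient field is re-created.
  static class WithReadObject implements Serializable {
    private static final long serialVersionUID = 1L;
    transient StringBuilder port;
    String name = "op";

    WithReadObject() {
      init();
    }

    private void init() {
      port = new StringBuilder("port");
    }

    private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException {
      in.defaultReadObject();
      init();
    }
  }

  // Serializes the value to a byte array and reads it back.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(new WithoutReadObject()).port == null); // prints true
    System.out.println(roundTrip(new WithReadObject()).port != null);    // prints true
  }
}
```

The non-transient field (name) is restored in both cases; only the transient one needs the extra step.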

- Tushar.


On Thu, Jan 14, 2016 at 10:51 AM, Gaurav Gupta <ga...@datatorrent.com>
wrote:

> Bhupesh,
>
> You can get more details here:
> http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
>
> Thanks
> - Gaurav
>
> > On Jan 12, 2016, at 9:36 AM, Gaurav Gupta <ga...@datatorrent.com>
> wrote:
> >
> > Bhupesh,
> >
> > There are following two ways
> >
> > 1. If you can change the classes, add default constructor to these
> classes.
> > 2. If you can’t change the classes, you can use custom serializer for
> these classes using Kryo’s @Bind annotation
> >
> > @Bind(JavaSerializer.class)
> > SetMultimap<String, String> someMap;
> >
> > This will work when there is an existing alternative serializer for the
> > type in question.
> >
> > Thanks
> > - Gaurav
> >
> >> On Jan 12, 2016, at 5:00 AM, Bhupesh Chawda <bhupesh@datatorrent.com
> <ma...@datatorrent.com>> wrote:
> >>
> >> Hi All,
> >>
> >> I am facing an issue with Kryo where some classes (which are a part of
> >> operators in Dag) in the imported jars do not have a zero-argument
> >> constructor. This results in a KryoException.
> >>
> >> Any suggestions on how to handle this?
> >> In general how can we deal with classes which do not have default
> >> constructors, without modifying them?
> >>
> >> Thanks.
> >> -Bhupesh

Re: Integration with Apache Samoa

Posted by Gaurav Gupta <ga...@datatorrent.com>.
Bhupesh,

You can get more details here: http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception

Thanks
- Gaurav

> On Jan 12, 2016, at 9:36 AM, Gaurav Gupta <ga...@datatorrent.com> wrote:
> 
> Bhupesh,
> 
> There are following two ways
> 
> 1. If you can change the classes, add default constructor to these classes. 
> 2. If you can’t change the classes, you can use custom serializer for these classes using Kryo’s @Bind annotation
> 
> @Bind(JavaSerializer.class)
> SetMultimap<String, String> someMap;
> 
> This will work when there is an existing alternative serializer for the
> type in question.
> 
> Thanks
> - Gaurav
> 
>> On Jan 12, 2016, at 5:00 AM, Bhupesh Chawda <bhupesh@datatorrent.com <ma...@datatorrent.com>> wrote:
>> 
>> Hi All,
>> 
>> I am facing an issue with Kryo where some classes (which are a part of
>> operators in Dag) in the imported jars do not have a zero-argument
>> constructor. This results in a KryoException.
>> 
>> Any suggestions on how to handle this?
>> In general how can we deal with classes which do not have default
>> constructors, without modifying them?
>> 
>> Thanks.
>> -Bhupesh


Re: Integration with Apache Samoa

Posted by Gaurav Gupta <ga...@datatorrent.com>.
Bhupesh,

There are two ways to handle this:

1. If you can change the classes, add a default constructor to them.
2. If you can’t change the classes, you can use a custom serializer for fields of these types via Kryo’s @Bind annotation:

@Bind(JavaSerializer.class)
SetMultimap<String, String> someMap;

This will work when there is an existing alternative serializer for the
type in question.
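
To illustrate why Java serialization can serve as the alternative here: Kryo's JavaSerializer delegates to the JDK's ObjectOutputStream/ObjectInputStream, and those do not need a zero-argument constructor on a Serializable class. The sketch below uses only the JDK, not Kryo, and Endpoint is a made-up class for illustration. It also assumes the type in question implements Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class JavaSerializationDemo {

  // Hypothetical library-style class: it only has a constructor that takes arguments.
  static final class Endpoint implements Serializable {
    private static final long serialVersionUID = 1L;
    final String host;
    final int port;

    Endpoint(String host, int port) {
      this.host = host;
      this.port = port;
    }
  }

  // Writes the object with JDK serialization and reads it back.
  static Endpoint roundTrip(Endpoint value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Endpoint) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Endpoint copy = roundTrip(new Endpoint("localhost", 9090));
    System.out.println(copy.host + ":" + copy.port); // prints localhost:9090
  }
}
```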

Thanks
- Gaurav

> On Jan 12, 2016, at 5:00 AM, Bhupesh Chawda <bh...@datatorrent.com> wrote:
> 
> Hi All,
> 
> I am facing an issue with Kryo where some classes (which are a part of
> operators in Dag) in the imported jars do not have a zero-argument
> constructor. This results in a KryoException.
> 
> Any suggestions on how to handle this?
> In general how can we deal with classes which do not have default
> constructors, without modifying them?
> 
> Thanks.
> -Bhupesh


Re: Integration with Apache Samoa

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Hi All,

I am facing an issue with Kryo where some classes (which are part of
operators in the DAG) in the imported jars do not have a zero-argument
constructor. This results in a KryoException.

Any suggestions on how to handle this?
In general, how can we deal with classes that do not have default
constructors, without modifying them?
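
For reference, a small reflection check like the one below can list which classes lack a zero-argument constructor before the application is launched. This is only a sketch with made-up helper names; it uses the JDK alone and does not call Kryo, so it approximates the condition rather than reproducing Kryo's own instantiation logic.

```java
import java.util.ArrayList;
import java.util.List;

class ZeroArgCheck {

  // True if the class declares a constructor with no parameters (any visibility).
  static boolean hasZeroArgConstructor(Class<?> type) {
    try {
      type.getDeclaredConstructor();
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  // Returns the subset of the given classes that lack a zero-argument constructor.
  static List<Class<?>> missingZeroArg(Class<?>... types) {
    List<Class<?>> missing = new ArrayList<>();
    for (Class<?> type : types) {
      if (!hasZeroArgConstructor(type)) {
        missing.add(type);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // ArrayList has a no-arg constructor, Integer does not.
    System.out.println(missingZeroArg(ArrayList.class, Integer.class));
  }
}
```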

Thanks.
-Bhupesh


Re: Integration with Apache Samoa

Posted by Vlad Rozov <v....@datatorrent.com>.
+1

On 10/29/15 08:47, Pramod Immaneni wrote:
> +1
>
> On Thu, Oct 29, 2015 at 8:34 AM, Amol Kekre <am...@datatorrent.com> wrote:
>
>> Samoa can be used to test iteration support as that feature gets developed.
>>
>> Amol
>>
>>
>> On Thu, Oct 29, 2015 at 5:12 AM, Bhupesh Chawda <bh...@datatorrent.com>
>> wrote:
>>
>>> Yes, iteration support will be needed for quite a few algorithms.
>>>
>>> Thanks.
>>> Bhupesh
>>>
>>> On Wed, Oct 28, 2015 at 7:20 PM, Sandesh Hegde <sa...@datatorrent.com>
>>> wrote:
>>>
>>>> Does it need iteration support?  Good idea to discuss this feature in
>>> both
>>>> the mailing list together.
>>>>
>>>> Adding Samoa mailing list.
>>>>
>>>> On Wed, Oct 28, 2015, 4:28 AM Sandeep Deshmukh <
>> sandeep@datatorrent.com>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Regards
>>>>> Sandeep
>>>>>
>>>>> Regards,
>>>>> Sandeep
>>>>>
>>>>> On Wed, Oct 28, 2015 at 11:36 AM, Amol Kekre <am...@datatorrent.com>
>>>> wrote:
>>>>>> +1
>>>>>>
>>>>>> Amol
>>>>>>
>>>>>> On Tue, Oct 27, 2015 at 10:27 PM, Bhupesh Chawda <
>>>>> bhupesh@datatorrent.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Apache Samoa <https://samoa.incubator.apache.org/> is a distributed
>>>>>>> streaming machine learning framework that contains a programming
>>>>>>> abstraction for distributed streaming machine learning algorithms. Apache
>>>>>>> SAMOA enables development of new ML algorithms without directly dealing
>>>>>>> with the complexity of underlying distributed stream processing engines
>>>>>>> (DSPEs, such as Apache Storm, Apache S4, and Apache Samza). Apache SAMOA
>>>>>>> users can develop distributed streaming ML algorithms once and execute them
>>>>>>> on multiple DSPEs.
>>>>>>>
>>>>>>> Apache Samoa currently has integrations with Apache Storm, Apache Flink,
>>>>>>> Apache S4 and Apache Samza. This means the ML algorithms developed on
>>>>>>> Apache Samoa can run on these platforms without any change in the
>>>>>>> algorithms.
>>>>>>> It would be a good idea to integrate Apache Apex as a distributed stream
>>>>>>> processing engine (DSPE) into Apache Samoa which would allow users to run
>>>>>>> ML algorithms developed in Samoa on Apache Apex.
>>>>>>>
>>>>>>> Here is the Apex JIRA for integration work:
>>>>>>> https://malhar.atlassian.net/browse/APEX-202
>>>>>>> Also, here is the JIRA in SAMOA project:
>>>>>>> https://issues.apache.org/jira/browse/SAMOA-49
>>>>>>>
>>>>>>> Thanks.
>>>>>>>


Re: Integration with Apache Samoa

Posted by Pramod Immaneni <pr...@datatorrent.com>.
+1

On Thu, Oct 29, 2015 at 8:34 AM, Amol Kekre <am...@datatorrent.com> wrote:

> Samoa can be used to test iteration support as that feature gets developed.
>
> Amol
>
>
> On Thu, Oct 29, 2015 at 5:12 AM, Bhupesh Chawda <bh...@datatorrent.com>
> wrote:
>
> > Yes, iteration support will be needed for quite a few algorithms.
> >
> > Thanks.
> > Bhupesh
> >
> > On Wed, Oct 28, 2015 at 7:20 PM, Sandesh Hegde <sa...@datatorrent.com>
> > wrote:
> >
> > > Does it need iteration support? Good idea to discuss this feature in both
> > > the mailing lists together.
> > >
> > > Adding Samoa mailing list.
> > >
> > > On Wed, Oct 28, 2015, 4:28 AM Sandeep Deshmukh <sandeep@datatorrent.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Regards
> > > > Sandeep
> > > >
> > > > Regards,
> > > > Sandeep
> > > >
> > > > On Wed, Oct 28, 2015 at 11:36 AM, Amol Kekre <am...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Amol
> > > > >
> > > > > On Tue, Oct 27, 2015 at 10:27 PM, Bhupesh Chawda <bhupesh@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > Apache Samoa <https://samoa.incubator.apache.org/> is a distributed
> > > > > > streaming machine learning framework that contains a programming
> > > > > > abstraction for distributed streaming machine learning algorithms. Apache
> > > > > > SAMOA enables development of new ML algorithms without directly dealing
> > > > > > with the complexity of underlying distributed stream processing engines
> > > > > > (DSPEs, such as Apache Storm, Apache S4, and Apache Samza). Apache SAMOA
> > > > > > users can develop distributed streaming ML algorithms once and execute them
> > > > > > on multiple DSPEs.
> > > > > >
> > > > > > Apache Samoa currently has integrations with Apache Storm, Apache Flink,
> > > > > > Apache S4 and Apache Samza. This means the ML algorithms developed on
> > > > > > Apache Samoa can run on these platforms without any change in the
> > > > > > algorithms.
> > > > > > It would be a good idea to integrate Apache Apex as a distributed stream
> > > > > > processing engine (DSPE) into Apache Samoa which would allow users to run
> > > > > > ML algorithms developed in Samoa on Apache Apex.
> > > > > >
> > > > > > Here is the Apex JIRA for integration work:
> > > > > > https://malhar.atlassian.net/browse/APEX-202
> > > > > > Also, here is the JIRA in SAMOA project:
> > > > > > https://issues.apache.org/jira/browse/SAMOA-49
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Integration with Apache Samoa

Posted by Amol Kekre <am...@datatorrent.com>.
Samoa can be used to test iteration support as that feature gets developed.

Amol


On Thu, Oct 29, 2015 at 5:12 AM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Yes, iteration support will be needed for quite a few algorithms.
>
> Thanks.
> Bhupesh
>
> On Wed, Oct 28, 2015 at 7:20 PM, Sandesh Hegde <sa...@datatorrent.com>
> wrote:
>
> > Does it need iteration support? Good idea to discuss this feature in both
> > the mailing lists together.
> >
> > Adding Samoa mailing list.
> >
> > On Wed, Oct 28, 2015, 4:28 AM Sandeep Deshmukh <sa...@datatorrent.com>
> > wrote:
> >
> > > +1
> > >
> > > Regards
> > > Sandeep
> > >
> > > Regards,
> > > Sandeep
> > >
> > > On Wed, Oct 28, 2015 at 11:36 AM, Amol Kekre <am...@datatorrent.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Amol
> > > >
> > > > On Tue, Oct 27, 2015 at 10:27 PM, Bhupesh Chawda <bhupesh@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > Apache Samoa <https://samoa.incubator.apache.org/> is a distributed
> > > > > streaming machine learning framework that contains a programming
> > > > > abstraction for distributed streaming machine learning algorithms. Apache
> > > > > SAMOA enables development of new ML algorithms without directly dealing
> > > > > with the complexity of underlying distributed stream processing engines
> > > > > (DSPEs, such as Apache Storm, Apache S4, and Apache Samza). Apache SAMOA
> > > > > users can develop distributed streaming ML algorithms once and execute them
> > > > > on multiple DSPEs.
> > > > >
> > > > > Apache Samoa currently has integrations with Apache Storm, Apache Flink,
> > > > > Apache S4 and Apache Samza. This means the ML algorithms developed on
> > > > > Apache Samoa can run on these platforms without any change in the
> > > > > algorithms.
> > > > > It would be a good idea to integrate Apache Apex as a distributed stream
> > > > > processing engine (DSPE) into Apache Samoa which would allow users to run
> > > > > ML algorithms developed in Samoa on Apache Apex.
> > > > >
> > > > > Here is the Apex JIRA for integration work:
> > > > > https://malhar.atlassian.net/browse/APEX-202
> > > > > Also, here is the JIRA in SAMOA project:
> > > > > https://issues.apache.org/jira/browse/SAMOA-49
> > > > >
> > > > > Thanks.
> > > > >
> > > >
> > >
> >
>

Re: Integration with Apache Samoa

Posted by Bhupesh Chawda <bh...@datatorrent.com>.
Yes, iteration support will be needed for quite a few algorithms.

Thanks.
Bhupesh

On Wed, Oct 28, 2015 at 7:20 PM, Sandesh Hegde <sa...@datatorrent.com>
wrote:

> Does it need iteration support? Good idea to discuss this feature in both
> the mailing lists together.
>
> Adding Samoa mailing list.
>
> On Wed, Oct 28, 2015, 4:28 AM Sandeep Deshmukh <sa...@datatorrent.com>
> wrote:
>
> > +1
> >
> > Regards
> > Sandeep
> >
> > Regards,
> > Sandeep
> >
> > On Wed, Oct 28, 2015 at 11:36 AM, Amol Kekre <am...@datatorrent.com>
> > wrote:
> >
> > > +1
> > >
> > > Amol
> > >
> > > On Tue, Oct 27, 2015 at 10:27 PM, Bhupesh Chawda <bhupesh@datatorrent.com>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > Apache Samoa <https://samoa.incubator.apache.org/> is a distributed
> > > > streaming machine learning framework that contains a programming
> > > > abstraction for distributed streaming machine learning algorithms. Apache
> > > > SAMOA enables development of new ML algorithms without directly dealing
> > > > with the complexity of underlying distributed stream processing engines
> > > > (DSPEs, such as Apache Storm, Apache S4, and Apache Samza). Apache SAMOA
> > > > users can develop distributed streaming ML algorithms once and execute them
> > > > on multiple DSPEs.
> > > >
> > > > Apache Samoa currently has integrations with Apache Storm, Apache Flink,
> > > > Apache S4 and Apache Samza. This means the ML algorithms developed on
> > > > Apache Samoa can run on these platforms without any change in the
> > > > algorithms.
> > > > It would be a good idea to integrate Apache Apex as a distributed stream
> > > > processing engine (DSPE) into Apache Samoa which would allow users to run
> > > > ML algorithms developed in Samoa on Apache Apex.
> > > >
> > > > Here is the Apex JIRA for integration work:
> > > > https://malhar.atlassian.net/browse/APEX-202
> > > > Also, here is the JIRA in SAMOA project:
> > > > https://issues.apache.org/jira/browse/SAMOA-49
> > > >
> > > > Thanks.
> > > >
> > >
> >
>
