Posted to dev@apex.apache.org by Bhupesh Chawda <bh...@datatorrent.com> on 2016/11/07 09:25:30 UTC

Re: Integration with Apache Samoa

Hi All,

The PR for making Apex a runner for SAMOA has been merged.

Apache SAMOA now has an additional runner with Apache Apex -
https://github.com/apache/incubator-samoa/tree/master/samoa-apex

Thanks.

~ Bhupesh

On Mon, Mar 28, 2016 at 11:05 AM, Bhupesh Chawda <bh...@datatorrent.com>
wrote:

> Hi All,
>
> Here is the status of the integration, since the last update:
>
>    - Launch process cleaned up. Launch still happening through DTCli
>    calls. Once APEXCORE-405
>    <https://issues.apache.org/jira/browse/APEXCORE-405> is implemented,
>    this will become a lot better.
>    - Some issues still cause problems when running Samoa apps on Apex:
>    containers are getting killed, possibly due to low memory. This is
>    work in progress.
>
> ~Bhupesh
>
> On Wed, Mar 2, 2016 at 10:14 AM, Bhupesh Chawda <bh...@datatorrent.com>
> wrote:
>
>> Hi David,
>>
>> Here is the working branch you can look at:
>> https://github.com/bhupeshchawda/incubator-samoa/tree/samoa-apex
>>
>> As I mentioned, the launch part needs to be worked on. Currently I have a
>> few hacks in my local environment.
>> You can use the test cases though to get an idea.
>>
>> -Bhupesh
>>
>> On Wed, Mar 2, 2016 at 1:01 AM, David Yan <da...@datatorrent.com> wrote:
>>
>>> Hi Bhupesh,
>>>
>>> That's good progress.  Can you send us a link to the code you did for
>>> this?  Or maybe a review-only PR?
>>>
>>> David
>>>
>>> On Mon, Feb 29, 2016 at 10:15 PM, Bhupesh Chawda <bhupesh@datatorrent.com> wrote:
>>>
>>> > Hi All,
>>> >
>>> > Here is the status of integration of Apache Apex into Apache Samoa.
>>> >
>>> >    - Samoa API implemented and able to convert a Samoa topology into an
>>> >    Apex DAG.
>>> >    - Implemented partitioning support using parallelism hints from the
>>> >    Samoa API.
>>> >    - Implemented stream multiplexing:
>>> >       - Added All-based partitioner. Upstream tuples go to all
>>> >       downstream partitions.
>>> >       - Stream codec for key-based partitioning
>>> >       - Stream codec for random partitioning
>>> >    - Able to launch a Samoa task on the cluster. This still needs work:
>>> >    currently some hacks are used, calling DTCli explicitly from the main
>>> >    entry point in Samoa code. Also, jars need to be bundled manually.
>>> >    This will be worked on in this sprint.
>>> >    - Tested the following algorithms on a local cluster:
>>> >       - Prequential Evaluation using the Vertical Hoeffding Tree
>>> >       classifier. This is a decision-tree-based classifier.
>>> >       - Clustering using the CluStream algorithm.
>>> >    - I have asked for clarification on some more details of these
>>> >    algorithms, as well as on serialization issues with Samoa classes. I
>>> >    am waiting for a response from the Samoa community.
>>> >    - Although Samoa does not have many algorithms currently (it is a
>>> >    framework for developing algorithms), more algorithms are expected as
>>> >    part of their roadmap:
>>> >    https://cwiki.apache.org/confluence/display/SAMOA/Roadmap
>>> > Thanks,
>>> >
>>> > Bhupesh
>>> >
>>>
>>
>>
>
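
As a rough illustration of the three stream-partitioning schemes listed in the February status update above (all, key-based, random), here is a minimal Python sketch. The function names are hypothetical, and this is neither the Apex stream codec API nor the code in samoa-apex; it only shows which downstream partitions a tuple would reach under each scheme.

```python
import random
from typing import Hashable, List

# Illustrative only: these helpers are hypothetical and are not Apex or SAMOA APIs.
# Each function returns the indices of the downstream partitions that receive a tuple.


def route_all(num_partitions: int) -> List[int]:
    """All-based: every downstream partition receives a copy of the tuple."""
    return list(range(num_partitions))


def route_by_key(key: Hashable, num_partitions: int) -> List[int]:
    """Key-based: tuples with equal keys reach the same partition (within one process)."""
    return [hash(key) % num_partitions]


def route_random(num_partitions: int, rng: random.Random) -> List[int]:
    """Random: each tuple goes to one partition chosen uniformly at random."""
    return [rng.randrange(num_partitions)]
```

In this sketch the parallelism hint would simply be the value passed as num_partitions.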

Re: Integration with Apache Samoa

Posted by Mohit Jotwani <mo...@datatorrent.com>.
Finally! Congratulations!!

Very well done....

Regards,
Mohit

Re: Integration with Apache Samoa

Posted by Sanjay Pujare <sa...@datatorrent.com>.
Yes, great job and impressive.

Re: Integration with Apache Samoa

Posted by Sandesh Hegde <sa...@datatorrent.com>.
Good work Bhupesh.

Re: Integration with Apache Samoa

Posted by David Yan <da...@datatorrent.com>.
It took perseverance to get this merged. Good work, Bhupesh!
