Posted to dev@apex.apache.org by tushargosavi <gi...@git.apache.org> on 2016/10/14 10:25:10 UTC

[GitHub] apex-core pull request #410: APEXCORE-408: Ability to schedule Sub-DAG from ...

GitHub user tushargosavi opened a pull request:

    https://github.com/apache/apex-core/pull/410

    APEXCORE-408: Ability to schedule Sub-DAG from running application.

    Pull request for dynamic DAG modification through the stats listener. It provides the following
    functionality:
    
    - StatsListener can access the operator name, making it easy to detect which operator's stats are being processed.
    - StatsListener can create an instance of an object through which it can submit DAG modifications to the engine.
    - StatsListener can return DAG changes as a response to the engine.
    - PlanModifier is modified to take a DAG, apply it to the existing running DAG, and deploy the changes.
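
As a rough illustration of the flow listed above (a listener sees an operator's name with its stats, returns a DAG change, and the change is applied to the running DAG), here is a minimal self-contained sketch. Every type in it (`DagChange`, `RunningDag`, `Listener`) and the operator names are hypothetical stand-ins invented for this example; they are not the Apex StatsListener or PlanModifier API, and the validation rule shown is an assumption rather than what the pull request implements.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; none of these are Apex API types.
final class DagChangeSketch {

    /** A requested change: operators and streams to add to the running DAG. */
    static final class DagChange {
        final List<String> operatorsToAdd = new ArrayList<>();
        final List<String[]> streamsToAdd = new ArrayList<>(); // each entry is {source, sink}

        DagChange addOperator(String name) {
            operatorsToAdd.add(name);
            return this;
        }

        DagChange addStream(String source, String sink) {
            streamsToAdd.add(new String[] {source, sink});
            return this;
        }
    }

    /** Minimal model of the running logical plan. */
    static final class RunningDag {
        final Set<String> operators = new LinkedHashSet<>();
        final Set<String> streams = new LinkedHashSet<>();

        /** Validates the whole change first, then applies it, so a rejected change leaves the DAG untouched. */
        void apply(DagChange change) {
            Set<String> known = new LinkedHashSet<>(operators);
            for (String op : change.operatorsToAdd) {
                if (!known.add(op)) {
                    throw new IllegalArgumentException("operator already exists: " + op);
                }
            }
            for (String[] s : change.streamsToAdd) {
                if (!known.contains(s[0]) || !known.contains(s[1])) {
                    throw new IllegalArgumentException("unknown endpoint in stream " + s[0] + "->" + s[1]);
                }
            }
            operators.addAll(change.operatorsToAdd);
            for (String[] s : change.streamsToAdd) {
                streams.add(s[0] + "->" + s[1]);
            }
        }
    }

    /** Listener callback: receives the operator name with its stats and may return a change (or null). */
    interface Listener {
        DagChange processStats(String operatorName, long pendingFiles);
    }

    public static void main(String[] args) {
        RunningDag dag = new RunningDag();
        dag.operators.add("reader");

        // The listener reacts only to stats from the (hypothetical) "reader" operator.
        Listener listener = (name, pending) ->
            "reader".equals(name) && pending > 0
                ? new DagChange().addOperator("splitter").addStream("reader", "splitter")
                : null;

        DagChange change = listener.processStats("reader", 2);
        if (change != null) {
            dag.apply(change);
        }
        System.out.println(dag.operators + " " + dag.streams);
    }
}
```

Validating before mutating is a design choice of this sketch: a change that names an unknown stream endpoint is rejected as a whole instead of being partially applied.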
    
    The following functionality is not working yet:
    
    - A new operator does not start from the correct windowId (https://issues.apache.org/jira/browse/APEXCORE-532).
    - A relaunched application fails to start if it was killed after a dynamic DAG modification.
    - There is no support for resuming operators from their previous state after they were removed. This could be achieved by
      reading state from external storage in setup.
    - Persist operator support is not present for newly added streams.
    - Not all parts are covered by tests.
    
    The demo application using the feature is available at
    https://github.com/tushargosavi/apex-dynamic-scheduling
    
    There are two variations of the WordCount application. The first variation detects the presence of
    new files and starts a disconnected DAG to process the data.
    (https://github.com/tushargosavi/apex-dynamic-scheduling/blob/master/src/main/java/com/datatorrent/wordcount/WordCountApp.java)
    
    In the second application (https://github.com/tushargosavi/apex-dynamic-scheduling/blob/master/src/main/java/com/datatorrent/wordcount/ExtendApp.java),
    only one reader operator is initially running in the DAG, and it provides pendingFiles as an auto-metric to the stats listener.
    On detecting pending files, the listener attaches the splitter, counter, and output operators to the reader operator. Once the files are processed, the splitter, counter, and
    output operators are removed, and they are added back if new data files are added to the directory.
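
The attach/remove cycle of the second application can be pictured as a small state machine driven by the pendingFiles metric. The sketch below is an assumption about the shape of that logic, not code taken from ExtendApp.java: the class, the `Action` enum, and the operator name "reader" are hypothetical, and treating `pendingFiles == 0` as "files are processed" is a simplification that ignores data still in flight downstream.

```java
// Hypothetical sketch of the extend/shrink decision; not taken from the demo application.
final class ExtendShrinkSketch {

    enum Action { NONE, ATTACH_DOWNSTREAM, DETACH_DOWNSTREAM }

    // Tracks whether the splitter, counter, and output operators are currently attached.
    private boolean downstreamAttached = false;

    /** Decides what to do when stats for one operator arrive. */
    Action onStats(String operatorName, long pendingFiles) {
        // Only the reader publishes pendingFiles; ignore stats from other operators.
        if (!"reader".equals(operatorName)) {
            return Action.NONE;
        }
        if (pendingFiles > 0 && !downstreamAttached) {
            downstreamAttached = true;
            return Action.ATTACH_DOWNSTREAM;
        }
        if (pendingFiles == 0 && downstreamAttached) {
            downstreamAttached = false;
            return Action.DETACH_DOWNSTREAM;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        ExtendShrinkSketch listener = new ExtendShrinkSketch();
        long[] pendingOverTime = {0, 3, 1, 0, 2};
        for (long pending : pendingOverTime) {
            System.out.println(pending + " pending: " + listener.onStats("reader", pending));
        }
    }
}
```

Keeping the attached/detached flag in the listener avoids re-submitting the same change on every stats callback while files are still pending.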


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tushargosavi/incubator-apex-core APEXCORE-408

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-core/pull/410.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #410
    
----
commit 5cecd029463de021469afb20650e3afbf75fd088
Author: Tushar R. Gosavi <tu...@apache.org>
Date:   2016-08-08T10:24:41Z

    APEXCORE-408: Ability to schedule Sub-DAG from running application.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] apex-core pull request #410: APEXCORE-408: Ability to schedule Sub-DAG from ...

Posted by tushargosavi <gi...@git.apache.org>.
Github user tushargosavi closed the pull request at:

    https://github.com/apache/apex-core/pull/410

