You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@samza.apache.org by "Shadi A. Noghabi" <ab...@illinois.edu> on 2015/08/08 20:19:52 UTC

Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/
-----------------------------------------------------------

(Updated Aug. 8, 2015, 6:19 p.m.)


Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.


Summary (updated)
-----------------

SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly. 


Repository: samza


Description (updated)
-------

This RBis for SAMZA-724.

After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 

This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 

The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >

The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:


<path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>


Diffs
-----

  build.gradle 0852adc4e8e0c2816afd1ebf433f1af6b44852f7 
  checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
  gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
  samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
  samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
  samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
  samza-test/src/test/scala/org/apache/samza/test/integration/TestStatefulTask.scala ea702a919348305ff95ce0b4ca1996a13aff04ec 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
  settings.gradle 19bff971ad221084dac10d3f7f3facfa42b829a7 

Diff: https://reviews.apache.org/r/36006/diff/


Testing
-------

Tested with hello samza and works properly.


Thanks,

Shadi A. Noghabi


Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

Posted by "Yi Pan (Data Infrastructure)" <yi...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/#review94705
-----------------------------------------------------------

Ship it!


LGTM. Just two nits. Thanks!


samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java (line 70)
<https://reviews.apache.org/r/36006/#comment149321>

    Question: why is this function only limited to running application instances? It seems maybe getRunningAppId() is more accurate?


- Yi Pan (Data Infrastructure)


On Aug. 8, 2015, 6:19 p.m., Shadi A. Noghabi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36006/
> -----------------------------------------------------------
> 
> (Updated Aug. 8, 2015, 6:19 p.m.)
> 
> 
> Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This RBis for SAMZA-724.
> 
> After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 
> 
> This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 
> 
> The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
> 1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
> 2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >
> 
> The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:
> 
> 
> <path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>
> 
> 
> Diffs
> -----
> 
>   build.gradle 0852adc4e8e0c2816afd1ebf433f1af6b44852f7 
>   checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
>   gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
>   samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
>   samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
>   samza-test/src/test/scala/org/apache/samza/test/integration/TestStatefulTask.scala ea702a919348305ff95ce0b4ca1996a13aff04ec 
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
>   settings.gradle 19bff971ad221084dac10d3f7f3facfa42b829a7 
> 
> Diff: https://reviews.apache.org/r/36006/diff/
> 
> 
> Testing
> -------
> 
> Tested with hello samza and works properly.
> 
> 
> Thanks,
> 
> Shadi A. Noghabi
> 
>


Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

Posted by Navina Ramesh <nr...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/#review95359
-----------------------------------------------------------

Ship it!


Tried it out. Works as expected. Thanks!

- Navina Ramesh


On Aug. 13, 2015, 11:54 p.m., Shadi A. Noghabi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36006/
> -----------------------------------------------------------
> 
> (Updated Aug. 13, 2015, 11:54 p.m.)
> 
> 
> Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This RBis for SAMZA-724.
> 
> After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 
> 
> This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 
> 
> The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
> 1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
> 2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >
> 
> The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:
> 
> 
> <path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>
> 
> 
> Diffs
> -----
> 
>   build.gradle a935088eccb3aee4fbde21275fa8e701c215a69e 
>   checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
>   gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
>   samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
>   samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
>   samza-test/src/test/scala/org/apache/samza/test/integration/StreamTaskTestUtil.scala c6e994ff707af802ded57c3bc1762971892014da 
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
>   settings.gradle a8d2c885254ca3994327fda18e09c49bc9c5e830 
> 
> Diff: https://reviews.apache.org/r/36006/diff/
> 
> 
> Testing
> -------
> 
> Tested with hello samza and works properly.
> 
> 
> Thanks,
> 
> Shadi A. Noghabi
> 
>


Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

Posted by "Shadi A. Noghabi" <ab...@illinois.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/
-----------------------------------------------------------

(Updated Aug. 14, 2015, 1:25 a.m.)


Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.


Changes
-------

Fixed some java doc nits.


Repository: samza


Description
-------

This RBis for SAMZA-724.

After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 

This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 

The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >

The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:


<path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>


Diffs (updated)
-----

  build.gradle a935088eccb3aee4fbde21275fa8e701c215a69e 
  checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
  gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
  samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
  samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
  samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
  samza-test/src/test/scala/org/apache/samza/test/integration/StreamTaskTestUtil.scala c6e994ff707af802ded57c3bc1762971892014da 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
  settings.gradle a8d2c885254ca3994327fda18e09c49bc9c5e830 

Diff: https://reviews.apache.org/r/36006/diff/


Testing
-------

Tested with hello samza and works properly.


Thanks,

Shadi A. Noghabi


Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

Posted by "Shadi A. Noghabi" <ab...@illinois.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/
-----------------------------------------------------------

(Updated Aug. 13, 2015, 11:54 p.m.)


Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.


Repository: samza


Description
-------

This RBis for SAMZA-724.

After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 

This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 

The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >

The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:


<path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>


Diffs (updated)
-----

  build.gradle a935088eccb3aee4fbde21275fa8e701c215a69e 
  checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
  gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
  samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
  samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
  samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
  samza-test/src/test/scala/org/apache/samza/test/integration/StreamTaskTestUtil.scala c6e994ff707af802ded57c3bc1762971892014da 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
  settings.gradle a8d2c885254ca3994327fda18e09c49bc9c5e830 

Diff: https://reviews.apache.org/r/36006/diff/


Testing
-------

Tested with hello samza and works properly.


Thanks,

Shadi A. Noghabi


Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

Posted by Navina Ramesh <nr...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/#review94992
-----------------------------------------------------------



samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java (line 98)
<https://reviews.apache.org/r/36006/#comment149763>

    The javadoc does not indicate that the polling Interval is configurable.



samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java (line 289)
<https://reviews.apache.org/r/36006/#comment149771>

    I was trying out your patch. 1000 seconds (~16 min) seems a litte too much time to wait. I would expect the overall time for wait (including multiple kill attempts) to be no more than 5 minutes. Can you change this hard-coded sleep time and make it configurable?



samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java (line 291)
<https://reviews.apache.org/r/36006/#comment149775>

    You are reading the application state only once outside the while loop in Line 286. So, the state never changes when the check happens in line 291. I think you should attempt to kill more than once and check the state each time.
    
    When I was trying out the patch, it killed the job fine. But it never came back up!



samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java (line 126)
<https://reviews.apache.org/r/36006/#comment149741>

    Can you add a short description for these exceptions here to get rid of these warning?
    
    samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java:355: warning: no @param for args
      public static void main(String[] args) {
                         ^
    samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java:126: warning: no description for @throws
       * @throws IOException
         ^
    samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java:127: warning: no description for @throws
       * @throws YarnException
         ^


- Navina Ramesh


On Aug. 11, 2015, 6:38 p.m., Shadi A. Noghabi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36006/
> -----------------------------------------------------------
> 
> (Updated Aug. 11, 2015, 6:38 p.m.)
> 
> 
> Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This RBis for SAMZA-724.
> 
> After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 
> 
> This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 
> 
> The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
> 1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
> 2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >
> 
> The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:
> 
> 
> <path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>
> 
> 
> Diffs
> -----
> 
>   build.gradle a935088eccb3aee4fbde21275fa8e701c215a69e 
>   checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
>   gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
>   samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
>   samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
>   samza-test/src/test/scala/org/apache/samza/test/integration/StreamTaskTestUtil.scala c6e994ff707af802ded57c3bc1762971892014da 
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
>   settings.gradle a8d2c885254ca3994327fda18e09c49bc9c5e830 
> 
> Diff: https://reviews.apache.org/r/36006/diff/
> 
> 
> Testing
> -------
> 
> Tested with hello samza and works properly.
> 
> 
> Thanks,
> 
> Shadi A. Noghabi
> 
>


Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

Posted by "Shadi A. Noghabi" <ab...@illinois.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/
-----------------------------------------------------------

(Updated Aug. 11, 2015, 6:38 p.m.)


Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.


Repository: samza


Description
-------

This RBis for SAMZA-724.

After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 

This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 

The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >

The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:


<path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>


Diffs (updated)
-----

  build.gradle a935088eccb3aee4fbde21275fa8e701c215a69e 
  checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
  gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
  samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
  samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
  samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
  samza-test/src/test/scala/org/apache/samza/test/integration/StreamTaskTestUtil.scala c6e994ff707af802ded57c3bc1762971892014da 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
  settings.gradle a8d2c885254ca3994327fda18e09c49bc9c5e830 

Diff: https://reviews.apache.org/r/36006/diff/


Testing
-------

Tested with hello samza and works properly.


Thanks,

Shadi A. Noghabi


Re: Review Request 36006: SAMZA-724: Writing a tool to read from the coordinator stream and react to config changes accordingly.

Posted by "Shadi A. Noghabi" <ab...@illinois.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36006/
-----------------------------------------------------------

(Updated Aug. 10, 2015, 6:35 p.m.)


Review request for samza, Yi Pan (Data Infrastructure), Navina Ramesh, and Naveen Somasundaram.


Repository: samza


Description
-------

This RBis for SAMZA-724.

After a job is submitted, it might need some configuration change, specifically it might need more containers. In SAMZA-704 a tool is being added to write to the coordinator stream (CoordinatorStreamWriter).  This tool can be used to write new configurations to the coordinator stream. However, another tool (ConfigManager) is needed to read the config changes and react to them, which is the goal of this task. This tool should be brought up after the job is submitted and read any config changes added to the coordinator stream, and react to each accordingly. 

This tool, called the Config Manager, is focusing on handling container changs by reacting to set-config massages with key "yarn.container.count". 

The config manager is a separate standa alone module, that should be brought up separately after the submission of a job. Therefore, you have to add two configurations to the input config file:
1. yarn.rm.address= <ip of resource manager in yarn. ex: localhost >
2. yarn.rm.port= <the port of the resource manager http server. ex: 8088 >

The config manger will periodically poll the coordinator stream to see if there are any new messages. This period is set to 100 ms by deafualt. However, it can be configured by adding configManager.polling.interval=<polling interval> to the input config file. Thus, overal the command to run the config manager along with the job would be:


<path to samza deployment>/bin/run-config-manager.sh --config-factory=<config factory> --config-path=<path to config file of a job>


Diffs (updated)
-----

  build.gradle 0852adc4e8e0c2816afd1ebf433f1af6b44852f7 
  checkstyle/import-control.xml 24ed680785175f3cdf955602b7a813684edd813e 
  gradle/dependency-versions.gradle fb06e8ed393d1a38abfa1a48fe5244fc7f6c7339 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java PRE-CREATION 
  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/utils/YarnUtil.java PRE-CREATION 
  samza-core/src/main/java/org/apache/samza/coordinator/stream/CoordinatorStreamSystemConsumer.java 2277a732b9ab763edf19a0fbec288ff72b27583b 
  samza-core/src/main/scala/org/apache/samza/job/JobRunner.scala d7c928c7401e539a370d4e82276e7dabbce1b638 
  samza-shell/src/main/bash/run-config-manager.sh PRE-CREATION 
  samza-test/src/test/scala/org/apache/samza/test/integration/TestStatefulTask.scala ea702a919348305ff95ce0b4ca1996a13aff04ec 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterService.scala ce88698c12c4bf6f4cf128f92d60b0b9496997d7 
  settings.gradle 19bff971ad221084dac10d3f7f3facfa42b829a7 

Diff: https://reviews.apache.org/r/36006/diff/


Testing
-------

Tested with hello samza and works properly.


Thanks,

Shadi A. Noghabi