You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@samza.apache.org by Navina Ramesh <nr...@linkedin.com> on 2016/01/04 21:14:37 UTC

Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/
-----------------------------------------------------------

(Updated Jan. 4, 2016, 8:14 p.m.)


Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).


Bugs: SAMZA-843
    https://issues.apache.org/jira/browse/SAMZA-843


Repository: samza


Description
-------

SAMZA-843 : Slow start of Samza jobs with large number of containers


Diffs (updated)
-----

  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
  samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
  samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
  samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
  samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
  samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 

Diff: https://reviews.apache.org/r/41663/diff/


Testing
-------

Added some additional unit tests
./gradlew clean build


Thanks,

Navina Ramesh


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Navina Ramesh <nr...@linkedin.com>.

> On Jan. 4, 2016, 8:45 p.m., Jagadish Venkatraman wrote:
> > samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala, line 113
> > <https://reviews.apache.org/r/41663/diff/2/?file=1181044#file1181044line113>
> >
> >     initializeJobModel already sets the jobModelRef and we are setting it again here? Am I missing something obvious? :)

Actually, I think I misread my code change today. We don't need the set jobModelRef in both places. Either one should work. It is slightly confusing because of all the function generator that is returned. I will fix this. Thanks for pointing it out!


- Navina


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112640
-----------------------------------------------------------


On Jan. 4, 2016, 8:14 p.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2016, 8:14 p.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Jagadish Venkatraman <ja...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112640
-----------------------------------------------------------



samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java (line 327)
<https://reviews.apache.org/r/41663/#comment173146>

    Prefer to define a variable here- delayMs.



samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala (line 113)
<https://reviews.apache.org/r/41663/#comment173153>

    initializeJobModel already sets the jobModelRef and we are setting it again here? Am I missing something obvious? :)



samza-core/src/main/scala/org/apache/samza/util/Util.scala (line 124)
<https://reviews.apache.org/r/41663/#comment173152>

    +1 on retaining the default read method :-)



samza-core/src/main/scala/org/apache/samza/util/Util.scala (line 133)
<https://reviews.apache.org/r/41663/#comment173151>

    Can you update the @param for retryBackOff too


- Jagadish Venkatraman


On Jan. 4, 2016, 8:14 p.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2016, 8:14 p.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Navina Ramesh <nr...@linkedin.com>.

> On Jan. 4, 2016, 11:11 p.m., Yi Pan (Data Infrastructure) wrote:
> > samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala, line 204
> > <https://reviews.apache.org/r/41663/diff/2-3/?file=1181044#file1181044line204>
> >
> >     I am a bit confused here. If the refreshJobModel() function actually returns a different jobModel than what's already set in jobModelRef, isn't the whole point is to replace it w/ the new one?
> 
> Yi Pan (Data Infrastructure) wrote:
>     Ah, I saw it now. Actually, I think that we should keep this line in initializaJobModel() method, but remove the following lines in getCoordinator():
>     {code}
>         val jobModel: JobModel = jobModelGenerator()
>         jobModelRef.set(jobModel)
>     {code}

The thing I noticed is that refreshJobModel and initializeJobModel are both private. The only time when the job model is getting refreshed is when getJobCoordinator is called. Now that I think about it, after the change to the JobServlet, we don't really make a call to refresh the job model. The only other place where we do a jobCoordinator.jobmodel() is in SamzaTaskManager. AT this point, we use the jobModel simply to invoke the readContainerLocality method on it. Technically, nothing refreshes the jobModel after initialization. Better discussed in person :)


- Navina


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112677
-----------------------------------------------------------


On Jan. 4, 2016, 10:40 p.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2016, 10:40 p.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by "Yi Pan (Data Infrastructure)" <yi...@linkedin.com>.

> On Jan. 4, 2016, 11:11 p.m., Yi Pan (Data Infrastructure) wrote:
> > samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala, line 204
> > <https://reviews.apache.org/r/41663/diff/2-3/?file=1181044#file1181044line204>
> >
> >     I am a bit confused here. If the refreshJobModel() function actually returns a different jobModel than what's already set in jobModelRef, isn't the whole point is to replace it w/ the new one?

Ah, I saw it now. Actually, I think that we should keep this line in initializaJobModel() method, but remove the following lines in getCoordinator():
{code}
    val jobModel: JobModel = jobModelGenerator()
    jobModelRef.set(jobModel)
{code}


- Yi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112677
-----------------------------------------------------------


On Jan. 4, 2016, 10:40 p.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2016, 10:40 p.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by "Yi Pan (Data Infrastructure)" <yi...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112677
-----------------------------------------------------------



samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 
<https://reviews.apache.org/r/41663/#comment173181>

    I am a bit confused here. If the refreshJobModel() function actually returns a different jobModel than what's already set in jobModelRef, isn't the whole point is to replace it w/ the new one?


- Yi Pan (Data Infrastructure)


On Jan. 4, 2016, 10:40 p.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2016, 10:40 p.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Navina Ramesh <nr...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/
-----------------------------------------------------------

(Updated Jan. 5, 2016, 9:50 p.m.)


Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).


Changes
-------

Addressed the changes suggested by Yi.


Bugs: SAMZA-843
    https://issues.apache.org/jira/browse/SAMZA-843


Repository: samza


Description
-------

SAMZA-843 : Slow start of Samza jobs with large number of containers


Diffs (updated)
-----

  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
  samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
  samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
  samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
  samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
  samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 

Diff: https://reviews.apache.org/r/41663/diff/


Testing
-------

Added some additional unit tests
./gradlew clean build


Thanks,

Navina Ramesh


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by "Yi Pan (Data Infrastructure)" <yi...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112912
-----------------------------------------------------------

Ship it!


lgtm. Fix the nits and feel free to commit! Thanks!


samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java (line 341)
<https://reviews.apache.org/r/41663/#comment173399>

    nit: we should use defaultReadJobModelDelayMs here as well.



samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala (line 104)
<https://reviews.apache.org/r/41663/#comment173400>

    nit: should we use defaultReadJobModelDelayMs instead of 100 here as well?


- Yi Pan (Data Infrastructure)


On Jan. 5, 2016, 1:45 a.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 5, 2016, 1:45 a.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Navina Ramesh <nr...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/
-----------------------------------------------------------

(Updated Jan. 5, 2016, 1:45 a.m.)


Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).


Changes
-------

Removed white space


Bugs: SAMZA-843
    https://issues.apache.org/jira/browse/SAMZA-843


Repository: samza


Description
-------

SAMZA-843 : Slow start of Samza jobs with large number of containers


Diffs (updated)
-----

  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
  samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
  samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
  samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
  samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
  samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 

Diff: https://reviews.apache.org/r/41663/diff/


Testing
-------

Added some additional unit tests
./gradlew clean build


Thanks,

Navina Ramesh


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Navina Ramesh <nr...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/
-----------------------------------------------------------

(Updated Jan. 5, 2016, 12:29 a.m.)


Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).


Changes
-------

After discussin in-person with Yi, I modified the initializeJobModel to return the JobModel itself, rather than a lambda for jobmodel generation


Bugs: SAMZA-843
    https://issues.apache.org/jira/browse/SAMZA-843


Repository: samza


Description
-------

SAMZA-843 : Slow start of Samza jobs with large number of containers


Diffs (updated)
-----

  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
  samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
  samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
  samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
  samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
  samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 

Diff: https://reviews.apache.org/r/41663/diff/


Testing
-------

Added some additional unit tests
./gradlew clean build


Thanks,

Navina Ramesh


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by "Yi Pan (Data Infrastructure)" <yi...@linkedin.com>.

> On Jan. 4, 2016, 11:14 p.m., Boris Shkolnik wrote:
> > samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala, line 117
> > <https://reviews.apache.org/r/41663/diff/3/?file=1181082#file1181082line117>
> >
> >     Where is the change to JobCoordinator constructor? Did I miss it?

The constructor actually does not change here. The way to get the initial arguments change.


- Yi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112678
-----------------------------------------------------------


On Jan. 4, 2016, 10:40 p.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2016, 10:40 p.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Navina Ramesh <nr...@linkedin.com>.

> On Jan. 4, 2016, 11:14 p.m., Boris Shkolnik wrote:
> > samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java, line 328
> > <https://reviews.apache.org/r/41663/diff/3/?file=1181080#file1181080line328>
> >
> >     Why do we need defaultReadJobModelDelayMS?
> >     If it is not specified - will still use Rand(100)?

Boris, 
*defaultReadJobModelDelayMs* is hard-coded within the class. I don't think Jagadish meant it to be configurable.


> On Jan. 4, 2016, 11:14 p.m., Boris Shkolnik wrote:
> > samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala, line 117
> > <https://reviews.apache.org/r/41663/diff/3/?file=1181082#file1181082line117>
> >
> >     Where is the change to JobCoordinator constructor? Did I miss it?
> 
> Yi Pan (Data Infrastructure) wrote:
>     The constructor actually does not change here. The way to get the initial arguments change.

Yep. Constructor doesn't change. jobModelGenerator() returns a JobModel instance. The constructor is same as before.


- Navina


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112678
-----------------------------------------------------------


On Jan. 5, 2016, 12:29 a.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 5, 2016, 12:29 a.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Boris Shkolnik <bo...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/#review112678
-----------------------------------------------------------



samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java (line 328)
<https://reviews.apache.org/r/41663/#comment173182>

    Why do we need defaultReadJobModelDelayMS?
    If it is not specified - will still use Rand(100)?



samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala (line 117)
<https://reviews.apache.org/r/41663/#comment173183>

    Where is the change to JobCoordinator constructor? Did I miss it?


- Boris Shkolnik


On Jan. 4, 2016, 10:40 p.m., Navina Ramesh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41663/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2016, 10:40 p.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-843
>     https://issues.apache.org/jira/browse/SAMZA-843
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> SAMZA-843 : Slow start of Samza jobs with large number of containers
> 
> 
> Diffs
> -----
> 
>   samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
>   samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
>   samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
>   samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
>   samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
>   samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 
> 
> Diff: https://reviews.apache.org/r/41663/diff/
> 
> 
> Testing
> -------
> 
> Added some additional unit tests
> ./gradlew clean build
> 
> 
> Thanks,
> 
> Navina Ramesh
> 
>


Re: Review Request 41663: SAMZA-843 : Slow start of Samza jobs with large number of containers

Posted by Navina Ramesh <nr...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41663/
-----------------------------------------------------------

(Updated Jan. 4, 2016, 10:40 p.m.)


Review request for samza, Boris Shkolnik, Yan Fang, Chris Riccomini, Jake Maes, Jagadish Venkatraman, Xinyu Liu, and Yi Pan (Data Infrastructure).


Bugs: SAMZA-843
    https://issues.apache.org/jira/browse/SAMZA-843


Repository: samza


Description
-------

SAMZA-843 : Slow start of Samza jobs with large number of containers


Diffs (updated)
-----

  samza-autoscaling/src/main/java/org/apache/samza/autoscaling/deployer/ConfigManager.java 87346bc9f3d92e3ae86ed264c82d12a4d27f3188 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala ddce1481cee41dfb997103dbe3d5df44a123564e 
  samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala 112ec1c2935fb6c2ebe9b5d9090263cea94c0cf9 
  samza-core/src/main/scala/org/apache/samza/coordinator/server/JobServlet.scala a3baddbe81c96f58e718e062e964485eaa2dc701 
  samza-core/src/main/scala/org/apache/samza/util/Util.scala 58fbb8f8177a109d35659a29ef6660e239334de2 
  samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala 365ff0a8ece808a8143ee3580f1f42238dd292d1 
  samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala 80cccf3104e6e0ce7457303f40baa2f4fa807782 

Diff: https://reviews.apache.org/r/41663/diff/


Testing
-------

Added some additional unit tests
./gradlew clean build


Thanks,

Navina Ramesh