Posted to dev@samza.apache.org by David Chen <dc...@linkedin.com> on 2014/09/11 02:10:47 UTC

Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/
-----------------------------------------------------------

Review request for samza.


Bugs: SAMZA-408
    https://issues.apache.org/jira/browse/SAMZA-408


Repository: samza


Description
-------

SAMZA-408: Expose metric for tracking AM availability.


Diffs
-----

  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterLifecycle.scala 3d17632e17d3495a4335a6a80bcdb9e40db9d184 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 09b1237d670307d8c51303bf1086bf863bad4756 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala 471eff499f9af4f76d434e8b5d79f618a5dcaeb8 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterTaskManager.scala ee08cfbce7ec3079ebf35bb510a22bcc0df1feb1 

Diff: https://reviews.apache.org/r/25522/diff/


Testing
-------


Thanks,

David Chen


Re: Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

Posted by David Chen <dc...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/
-----------------------------------------------------------

(Updated Sept. 11, 2014, 6:45 p.m.)


Review request for samza.


Changes
-------

Handle case where container is restarted due to node failure.


Bugs: SAMZA-408
    https://issues.apache.org/jira/browse/SAMZA-408


Repository: samza


Description
-------

SAMZA-408: Expose metric for tracking AM availability.


Diffs (updated)
-----

  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterLifecycle.scala 3d17632e17d3495a4335a6a80bcdb9e40db9d184 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 09b1237d670307d8c51303bf1086bf863bad4756 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala 471eff499f9af4f76d434e8b5d79f618a5dcaeb8 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterTaskManager.scala ee08cfbce7ec3079ebf35bb510a22bcc0df1feb1 
  samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterTaskManager.scala 685620fd630fba024d568c6ed7b86c5432d641b2 

Diff: https://reviews.apache.org/r/25522/diff/


Testing
-------

Unit tests pass.


Thanks,

David Chen


Re: Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

Posted by Chinmay Soman <ch...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/#review53001
-----------------------------------------------------------

Ship it!



- Chinmay Soman




Re: Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

Posted by David Chen <dc...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/
-----------------------------------------------------------

(Updated Sept. 11, 2014, 5:06 a.m.)


Review request for samza.


Changes
-------

Handle the case where containers are restarted. Add test coverage.


Bugs: SAMZA-408
    https://issues.apache.org/jira/browse/SAMZA-408


Repository: samza


Description
-------

SAMZA-408: Expose metric for tracking AM availability.


Diffs (updated)
-----

  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterLifecycle.scala 3d17632e17d3495a4335a6a80bcdb9e40db9d184 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 09b1237d670307d8c51303bf1086bf863bad4756 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala 471eff499f9af4f76d434e8b5d79f618a5dcaeb8 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterTaskManager.scala ee08cfbce7ec3079ebf35bb510a22bcc0df1feb1 
  samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterTaskManager.scala 685620fd630fba024d568c6ed7b86c5432d641b2 

Diff: https://reviews.apache.org/r/25522/diff/


Testing (updated)
-------

Unit tests pass.


Thanks,

David Chen


Re: Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

Posted by David Chen <dc...@linkedin.com>.

> On Sept. 11, 2014, 12:30 a.m., Chinmay Soman wrote:
> > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala, line 45
> > <https://reviews.apache.org/r/25522/diff/1/?file=684858#file684858line45>
> >
> >     If a particular container fails, this will be set to False.
> >     
> >     However, when that container is restarted, shouldn't we set this back to True?
> >     
> >     From the current code, it seems like this will remain False after the first incident.

Good point. When a container is allocated, right after state.neededContainers is decremented, we should check whether all containers are running again. If so, jobHealthy should be set back to true.
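
A minimal sketch of the check described in this reply, not the actual patch. Only the names neededContainers and jobHealthy come from the thread; the class AppMasterStateSketch, the object JobHealthSketch, and both handler methods are hypothetical stand-ins, and the sketch assumes neededContainers is 0 whenever every container is running.

```scala
// Hypothetical, simplified stand-in for the AM state. Only the field names
// neededContainers and jobHealthy are taken from the review thread.
final class AppMasterStateSketch {
  var neededContainers: Int = 0   // assumption: 0 means all containers are running
  var jobHealthy: Boolean = true
}

object JobHealthSketch {
  // A container failed: one more container is needed and the job is unhealthy.
  def onContainerFailed(state: AppMasterStateSketch): Unit = {
    state.neededContainers += 1
    state.jobHealthy = false
  }

  // A replacement container was allocated: decrement first, then re-check
  // whether every container is running again.
  def onContainerAllocated(state: AppMasterStateSketch): Unit = {
    if (state.neededContainers > 0) state.neededContainers -= 1
    if (state.neededContainers == 0) state.jobHealthy = true
  }

  def main(args: Array[String]): Unit = {
    val state = new AppMasterStateSketch
    onContainerFailed(state)
    println(s"after failure: jobHealthy=${state.jobHealthy}")
    onContainerAllocated(state)
    println(s"after restart: jobHealthy=${state.jobHealthy}")
  }
}
```

Without the second check in onContainerAllocated, jobHealthy would stay false after the first failure, which is the behaviour the review comment points out.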


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/#review52983
-----------------------------------------------------------




Re: Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

Posted by David Chen <dc...@linkedin.com>.

> On Sept. 11, 2014, 12:30 a.m., Chinmay Soman wrote:
> > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala, line 45
> > <https://reviews.apache.org/r/25522/diff/1/?file=684858#file684858line45>
> >
> >     If a particular container fails, this will be set to False.
> >     
> >     However, when that container is restarted, shouldn't we set this back to True?
> >     
> >     From the current code, it seems like this will remain False after the first incident.
> 
> David Chen wrote:
>     Good point. When a container is allocated, right after state.neededContainers is decremented, we should check whether all containers are running again. If so, jobHealthy should be set back to true.

I will also add some code to verify this in the tests and then attach a new patch.
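
For illustration only, the kind of verification mentioned here could look roughly like the sketch below. It is not the code added to TestSamzaAppMasterTaskManager.scala: the Health case class and the fail and allocate helpers are hypothetical, and the sketch assumes neededContainers is 0 when all containers are running. It checks that health is restored only once every failed container has been replaced.

```scala
object JobHealthRestartCheck {
  // Hypothetical immutable stand-in for the AM state; not Samza's real class.
  final case class Health(neededContainers: Int, jobHealthy: Boolean)

  // A container failure raises the needed count and marks the job unhealthy.
  def fail(h: Health): Health =
    Health(h.neededContainers + 1, jobHealthy = false)

  // An allocation lowers the needed count; the job is healthy only at zero.
  def allocate(h: Health): Health = {
    val needed = math.max(0, h.neededContainers - 1)
    Health(needed, jobHealthy = needed == 0)
  }

  def main(args: Array[String]): Unit = {
    val start = Health(neededContainers = 0, jobHealthy = true)

    val twoDown = fail(fail(start))
    assert(!twoDown.jobHealthy, "job should be unhealthy after failures")

    val oneBack = allocate(twoDown)
    assert(!oneBack.jobHealthy, "one container is still missing")

    val allBack = allocate(oneBack)
    assert(allBack.jobHealthy, "job should be healthy once all containers are back")

    println("restart checks passed")
  }
}
```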


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/#review52983
-----------------------------------------------------------




Re: Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

Posted by Chinmay Soman <ch...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/#review52983
-----------------------------------------------------------



samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala
<https://reviews.apache.org/r/25522/#comment92289>

    If a particular container fails, this will be set to False.
    
    However, when that container is restarted, shouldn't we set this back to True?
    
    From the current code, it seems like this will remain False after the first incident.


- Chinmay Soman




Re: Review Request 25522: SAMZA-408: Expose metric for tracking AM availability.

Posted by David Chen <dc...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/
-----------------------------------------------------------

(Updated Sept. 11, 2014, 12:19 a.m.)


Review request for samza.


Bugs: SAMZA-408
    https://issues.apache.org/jira/browse/SAMZA-408


Repository: samza


Description
-------

SAMZA-408: Expose metric for tracking AM availability.


Diffs (updated)
-----

  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterLifecycle.scala 3d17632e17d3495a4335a6a80bcdb9e40db9d184 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 09b1237d670307d8c51303bf1086bf863bad4756 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala 471eff499f9af4f76d434e8b5d79f618a5dcaeb8 
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterTaskManager.scala ee08cfbce7ec3079ebf35bb510a22bcc0df1feb1 

Diff: https://reviews.apache.org/r/25522/diff/


Testing
-------


Thanks,

David Chen