Posted to reviews@aurora.apache.org by Franck Cuny via Review Board <no...@reviews.apache.org> on 2018/02/21 15:59:52 UTC

Review Request 65735: Add GPUs to the resources.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------

Review request for Aurora.


Repository: aurora


Description
-------

Summary:
Running `job status` on a job with GPUs fails because the resource type is not
known. The solution is to add the GPU type to the resources.
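
A minimal sketch of the failure mode and the fix, not the code from the patch.
It assumes the client looks resources up in a table keyed by field name and
raises on anything it does not recognize. `ResourceKind`, `KNOWN_KINDS`,
`WITH_GPU`, `describe` and the field names are hypothetical, chosen for
illustration only. The labels and units follow the `job status` output in the
test plan.

```python
# Hypothetical sketch: names and field keys are illustrative assumptions,
# not taken from src/main/python/apache/aurora/config/resource.py.
from collections import namedtuple

ResourceKind = namedtuple("ResourceKind", ["label", "unit", "cast"])

# Resource kinds known before the change: there is no GPU entry.
KNOWN_KINDS = {
    "numCpus": ResourceKind("CPU", " core(s)", float),
    "ramMb": ResourceKind("RAM", " MB", int),
    "diskMb": ResourceKind("Disk", " MB", int),
}


def describe(kinds, field, value):
    """Render one resource for display, failing loudly on an unknown kind."""
    if field not in kinds:
        raise ValueError("Unknown resource type: %s" % field)
    kind = kinds[field]
    return "%s: %s%s" % (kind.label, kind.cast(value), kind.unit)


# Registering a GPU kind is what allows a GPU job to be rendered.
WITH_GPU = dict(KNOWN_KINDS, numGpus=ResourceKind("GPU", " GPU(s)", int))
```

With this table, `describe(KNOWN_KINDS, "numGpus", 8)` raises `ValueError`,
while `describe(WITH_GPU, "numGpus", 8)` returns a display string for the GPU
count.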

Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
 INFO] Checking status of atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
Active tasks (1):
	Task role: cortex-m40, env: devel, name: aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001, instance: 0, status: RUNNING on atla-erc-13-sr1.prod.twttr.net
	  CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
	  assigned ports: {u'http': 31891}
	  failure count: 0 (max 1)
	  events:
	   2018-02-18 10:02:57 PENDING: None
	   2018-02-18 10:02:57 ASSIGNED: None
	   2018-02-18 10:02:58 STARTING: Initializing sandbox.
	   2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
	  metadata:
		  (key: 'package', value: 'aaitken/twml_gpu version:7')
		  (key: 'package', value: 'aaitken/train_resnet version:212')
		  (key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```

Differential Revision: https://phabricator.twitter.biz/D140046


Diffs
-----

  src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af 
  src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99 


Diff: https://reviews.apache.org/r/65735/diff/1/


Testing
-------


Thanks,

Franck Cuny


Re: Review Request 65735: Add GPUs to the resources.

Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197909
-----------------------------------------------------------


Ship it!




Thanks for the patch!


src/test/python/apache/aurora/config/test_resources.py
Lines 27 (patched)
<https://reviews.apache.org/r/65735/#comment278159>

    Nit: should be an int here and elsewhere in the test.


- David McLaughlin




Re: Review Request 65735: Add GPUs to the resources.

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197912
-----------------------------------------------------------


Ship it!




Master (e2ea191) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot




Re: Review Request 65735: Add GPUs to the resources.

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197920
-----------------------------------------------------------



Master (e2ea191) is red with this patch.
  ./build-support/jenkins/build.sh


  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0   122    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

100   454  100   454    0     0   2868      0 --:--:-- --:--:-- --:--:--  2868

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
/usr/bin/python2.7: can't open file '/home/jenkins/jenkins-slave/workspace/AuroraBot/.home/.cache/pants/setup/bootstrap-Linux-x86_64/virtualenv-15.0.2/virtualenv.py': [Errno 2] No such file or directory
./pants: line 99: /home/jenkins/jenkins-slave/workspace/AuroraBot/.home/.cache/pants/setup/bootstrap-Linux-x86_64/1.4.0.dev23/bin/python: No such file or directory
Traceback (most recent call last):
  File "/dev/fd/63", line 7, in <module>
    
  File "/usr/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Pants thrift version  does not match expected version 0.10.0!
:api:generateThriftJava FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':api:generateThriftJava'.
> Process 'command '/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/thrift/thriftw'' finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 12s
7 actionable tasks: 1 executed, 6 up-to-date


I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot




Re: Review Request 65735: Add GPUs to the resources.

Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197992
-----------------------------------------------------------



This patch has landed on master. Please feel free to close the review.

- David McLaughlin




Re: Review Request 65735: Add GPUs to the resources.

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197935
-----------------------------------------------------------


Ship it!




Master (e2ea191) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot




Re: Review Request 65735: Add GPUs to the resources.

Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197972
-----------------------------------------------------------


Ship it!




Ship It!

- Santhosh Kumar Shanmugham




Re: Review Request 65735: Add GPUs to the resources.

Posted by Reza Motamedi <re...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197963
-----------------------------------------------------------


Ship it!




Ship It!

- Reza Motamedi




Re: Review Request 65735: Add GPUs to the resources.

Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197924
-----------------------------------------------------------



@ReviewBot retry

- Franck Cuny




Re: Review Request 65735: Add GPUs to the resources.

Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------

(Updated Feb. 21, 2018, 4:31 p.m.)


Review request for Aurora.


Repository: aurora


Description (updated)
-------

Summary:
Running `job status` on a job with GPUs fails because the resource type is not
known. The solution is to add the GPU type to the resources.

Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status <job name>
 INFO] Checking status of <job name>
Active tasks (1):
	Task role: <role>, env: <env>, name: <name>, instance: 0, status: RUNNING on <host>
	  CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
	  assigned ports: {u'http': 31891}
	  failure count: 0 (max 1)
	  events:
	   2018-02-18 10:02:57 PENDING: None
	   2018-02-18 10:02:57 ASSIGNED: None
	   2018-02-18 10:02:58 STARTING: Initializing sandbox.
	   2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
	  metadata:
		  (key: 'package', value: 'aaitken/twml_gpu version:7')
		  (key: 'package', value: 'aaitken/train_resnet version:212')
		  (key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```

Differential Revision: https://phabricator.twitter.biz/D140046


Diffs (updated)
-----

  src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af 
  src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99 


Diff: https://reviews.apache.org/r/65735/diff/4/

Changes: https://reviews.apache.org/r/65735/diff/3-4/


Testing
-------


Thanks,

Franck Cuny


Re: Review Request 65735: Add GPUs to the resources.

Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------

(Updated Feb. 21, 2018, 4:30 p.m.)


Review request for Aurora.


Repository: aurora


Description
-------

Summary:
Running `job status` on a job with GPUs fails because the resource type is not
known. The solution is to add the GPU type to the resources.

Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
 INFO] Checking status of atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
Active tasks (1):
	Task role: cortex-m40, env: devel, name: aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001, instance: 0, status: RUNNING on atla-erc-13-sr1.prod.twttr.net
	  CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
	  assigned ports: {u'http': 31891}
	  failure count: 0 (max 1)
	  events:
	   2018-02-18 10:02:57 PENDING: None
	   2018-02-18 10:02:57 ASSIGNED: None
	   2018-02-18 10:02:58 STARTING: Initializing sandbox.
	   2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
	  metadata:
		  (key: 'package', value: 'aaitken/twml_gpu version:7')
		  (key: 'package', value: 'aaitken/train_resnet version:212')
		  (key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```

Differential Revision: https://phabricator.twitter.biz/D140046


Diffs (updated)
-----

  src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af 
  src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99 


Diff: https://reviews.apache.org/r/65735/diff/3/

Changes: https://reviews.apache.org/r/65735/diff/2-3/


Testing
-------


Thanks,

Franck Cuny


Re: Review Request 65735: Add GPUs to the resources.

Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------

(Updated Feb. 21, 2018, 4:23 p.m.)


Review request for Aurora.


Repository: aurora


Description
-------

Summary:
Running `job status` on a job with GPUs fails because the resource type is not
known. The solution is to add the GPU type to the resources.

Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
 INFO] Checking status of atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
Active tasks (1):
	Task role: cortex-m40, env: devel, name: aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001, instance: 0, status: RUNNING on atla-erc-13-sr1.prod.twttr.net
	  CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
	  assigned ports: {u'http': 31891}
	  failure count: 0 (max 1)
	  events:
	   2018-02-18 10:02:57 PENDING: None
	   2018-02-18 10:02:57 ASSIGNED: None
	   2018-02-18 10:02:58 STARTING: Initializing sandbox.
	   2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
	  metadata:
		  (key: 'package', value: 'aaitken/twml_gpu version:7')
		  (key: 'package', value: 'aaitken/train_resnet version:212')
		  (key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```

Differential Revision: https://phabricator.twitter.biz/D140046


Diffs (updated)
-----

  src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af 
  src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99 


Diff: https://reviews.apache.org/r/65735/diff/2/

Changes: https://reviews.apache.org/r/65735/diff/1-2/


Testing
-------


Thanks,

Franck Cuny