Posted to reviews@aurora.apache.org by Franck Cuny via Review Board <no...@reviews.apache.org> on 2018/02/21 15:59:52 UTC
Review Request 65735: Add GPUs to the resources.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------
Review request for Aurora.
Repository: aurora
Description
-------
Summary:
Running `job status` on a job with GPUs fails, because the resource type is not
known. The solution is to add the GPU type to the resources.
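As a rough illustration of the failure mode described in the summary, the sketch below models a client-side registry of resource types. It is not the contents of `resource.py`: the `ResourceType` tuple, the `KNOWN_TYPES` table, the `describe` helper, and the field names are all hypothetical and chosen only to show why an unregistered type breaks formatting and how registering a GPU type resolves it.

```python
from collections import namedtuple

# Hypothetical stand-in for a resource-type registry; these names are
# illustrative and are not taken from apache/aurora/config/resource.py.
ResourceType = namedtuple("ResourceType", ["display_name", "unit", "cast"])

KNOWN_TYPES = {
    "numCpus": ResourceType("CPU", "core(s)", float),
    "ramMb": ResourceType("RAM", "MB", int),
    "diskMb": ResourceType("Disk", "MB", int),
}


def describe(resources, known_types):
    """Format (field, value) pairs, failing on any unregistered field."""
    parts = []
    for field, value in resources:
        if field not in known_types:
            raise ValueError("Unknown resource type: %s" % field)
        rtype = known_types[field]
        parts.append("%s: %s %s" % (rtype.display_name, rtype.cast(value), rtype.unit))
    return ", ".join(parts)


task = [("numCpus", 22), ("ramMb", 131072), ("diskMb", 40960), ("numGpus", 8)]

# Without a GPU entry, describing a GPU task fails on the unknown field.
try:
    describe(task, KNOWN_TYPES)
    failure = None
except ValueError as error:
    failure = str(error)

# Registering the GPU type lets the same task be described.
with_gpu = dict(KNOWN_TYPES, numGpus=ResourceType("GPU", "GPU(s)", int))
summary = describe(task, with_gpu)
```

In this sketch the GPU quantity is cast to an integer, while the CPU quantity remains a float.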
Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
INFO] Checking status of atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
Active tasks (1):
Task role: cortex-m40, env: devel, name: aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001, instance: 0, status: RUNNING on atla-erc-13-sr1.prod.twttr.net
CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
assigned ports: {u'http': 31891}
failure count: 0 (max 1)
events:
2018-02-18 10:02:57 PENDING: None
2018-02-18 10:02:57 ASSIGNED: None
2018-02-18 10:02:58 STARTING: Initializing sandbox.
2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
metadata:
(key: 'package', value: 'aaitken/twml_gpu version:7')
(key: 'package', value: 'aaitken/train_resnet version:212')
(key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```
Differential Revision: https://phabricator.twitter.biz/D140046
Diffs
-----
src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af
src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99
Diff: https://reviews.apache.org/r/65735/diff/1/
Testing
-------
Thanks,
Franck Cuny
Re: Review Request 65735: Add GPUs to the resources.
Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197909
-----------------------------------------------------------
Ship it!
Thanks for the patch!
src/test/python/apache/aurora/config/test_resources.py
Lines 27 (patched)
<https://reviews.apache.org/r/65735/#comment278159>
Nit: this should be an int, here and elsewhere in the test.
- David McLaughlin
Re: Review Request 65735: Add GPUs to the resources.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197912
-----------------------------------------------------------
Ship it!
Master (e2ea191) is green with this patch.
./build-support/jenkins/build.sh
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
Re: Review Request 65735: Add GPUs to the resources.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197920
-----------------------------------------------------------
Master (e2ea191) is red with this patch.
./build-support/jenkins/build.sh
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 122 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 454 100 454 0 0 2868 0 --:--:-- --:--:-- --:--:-- 2868
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
/usr/bin/python2.7: can't open file '/home/jenkins/jenkins-slave/workspace/AuroraBot/.home/.cache/pants/setup/bootstrap-Linux-x86_64/virtualenv-15.0.2/virtualenv.py': [Errno 2] No such file or directory
./pants: line 99: /home/jenkins/jenkins-slave/workspace/AuroraBot/.home/.cache/pants/setup/bootstrap-Linux-x86_64/1.4.0.dev23/bin/python: No such file or directory
Traceback (most recent call last):
  File "/dev/fd/63", line 7, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Pants thrift version does not match expected version 0.10.0!
:api:generateThriftJava FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':api:generateThriftJava'.
> Process 'command '/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/thrift/thriftw'' finished with non-zero exit value 1
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
* Get more help at https://help.gradle.org
BUILD FAILED in 12s
7 actionable tasks: 1 executed, 6 up-to-date
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
Re: Review Request 65735: Add GPUs to the resources.
Posted by David McLaughlin <da...@dmclaughlin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197992
-----------------------------------------------------------
This patch has landed on master. Please feel free to close the review.
- David McLaughlin
Re: Review Request 65735: Add GPUs to the resources.
Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197935
-----------------------------------------------------------
Ship it!
Master (e2ea191) is green with this patch.
./build-support/jenkins/build.sh
I will refresh this build result if you post a review containing "@ReviewBot retry"
- Aurora ReviewBot
Re: Review Request 65735: Add GPUs to the resources.
Posted by Santhosh Kumar Shanmugham <sa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197972
-----------------------------------------------------------
Ship it!
- Santhosh Kumar Shanmugham
Re: Review Request 65735: Add GPUs to the resources.
Posted by Reza Motamedi <re...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197963
-----------------------------------------------------------
Ship it!
- Reza Motamedi
Re: Review Request 65735: Add GPUs to the resources.
Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/#review197924
-----------------------------------------------------------
@ReviewBot retry
- Franck Cuny
Re: Review Request 65735: Add GPUs to the resources.
Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------
(Updated Feb. 21, 2018, 4:31 p.m.)
Review request for Aurora.
Repository: aurora
Description (updated)
-------
Summary:
Running `job status` on a job with GPUs fails, because the resource type is not
known. The solution is to add the GPU type to the resources.
Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status <job name>
INFO] Checking status of <job name>
Active tasks (1):
Task role: <role>, env: <env>, name: <name>, instance: 0, status: RUNNING on <host>
CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
assigned ports: {u'http': 31891}
failure count: 0 (max 1)
events:
2018-02-18 10:02:57 PENDING: None
2018-02-18 10:02:57 ASSIGNED: None
2018-02-18 10:02:58 STARTING: Initializing sandbox.
2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
metadata:
(key: 'package', value: 'aaitken/twml_gpu version:7')
(key: 'package', value: 'aaitken/train_resnet version:212')
(key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```
Differential Revision: https://phabricator.twitter.biz/D140046
Diffs (updated)
-----
src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af
src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99
Diff: https://reviews.apache.org/r/65735/diff/4/
Changes: https://reviews.apache.org/r/65735/diff/3-4/
Testing
-------
Thanks,
Franck Cuny
Re: Review Request 65735: Add GPUs to the resources.
Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------
(Updated Feb. 21, 2018, 4:30 p.m.)
Review request for Aurora.
Repository: aurora
Description
-------
Summary:
Running `job status` on a job with GPUs fails, because the resource type is not
known. The solution is to add the GPU type to the resources.
Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
INFO] Checking status of atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
Active tasks (1):
Task role: cortex-m40, env: devel, name: aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001, instance: 0, status: RUNNING on atla-erc-13-sr1.prod.twttr.net
CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
assigned ports: {u'http': 31891}
failure count: 0 (max 1)
events:
2018-02-18 10:02:57 PENDING: None
2018-02-18 10:02:57 ASSIGNED: None
2018-02-18 10:02:58 STARTING: Initializing sandbox.
2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
metadata:
(key: 'package', value: 'aaitken/twml_gpu version:7')
(key: 'package', value: 'aaitken/train_resnet version:212')
(key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```
Differential Revision: https://phabricator.twitter.biz/D140046
Diffs (updated)
-----
src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af
src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99
Diff: https://reviews.apache.org/r/65735/diff/3/
Changes: https://reviews.apache.org/r/65735/diff/2-3/
Testing
-------
Thanks,
Franck Cuny
Re: Review Request 65735: Add GPUs to the resources.
Posted by Franck Cuny via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65735/
-----------------------------------------------------------
(Updated Feb. 21, 2018, 4:23 p.m.)
Review request for Aurora.
Repository: aurora
Description
-------
Summary:
Running `job status` on a job with GPUs fails, because the resource type is not
known. The solution is to add the GPU type to the resources.
Test Plan:
Ran the unit tests and built the client:
```
./dist/aurora_internal.pex job status atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
INFO] Checking status of atla/cortex-m40/devel/aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001
Active tasks (1):
Task role: cortex-m40, env: devel, name: aaitken_train_resnet_last_layer_vgg_preproc_lr_0p00001, instance: 0, status: RUNNING on atla-erc-13-sr1.prod.twttr.net
CPU: 22.0 core(s), RAM: 131072 MB, Disk: 40960 MB, Port: http, GPU: 8 GPU(s)
assigned ports: {u'http': 31891}
failure count: 0 (max 1)
events:
2018-02-18 10:02:57 PENDING: None
2018-02-18 10:02:57 ASSIGNED: None
2018-02-18 10:02:58 STARTING: Initializing sandbox.
2018-02-18 10:03:00 RUNNING: No health-check defined, task is assumed healthy.
metadata:
(key: 'package', value: 'aaitken/twml_gpu version:7')
(key: 'package', value: 'aaitken/train_resnet version:212')
(key: 'package', value: 'aaitken/deps_pex version:62')
Inactive tasks (0):
```
Differential Revision: https://phabricator.twitter.biz/D140046
Diffs (updated)
-----
src/main/python/apache/aurora/config/resource.py 85e1d002e295feb312ba5d6033295705e0c2d2af
src/test/python/apache/aurora/config/test_resources.py f43bad725e397d04817ccbf1c76229d572185d99
Diff: https://reviews.apache.org/r/65735/diff/2/
Changes: https://reviews.apache.org/r/65735/diff/1-2/
Testing
-------
Thanks,
Franck Cuny