Posted to issues@madlib.apache.org by "Frank McQuillan (Jira)" <ji...@apache.org> on 2019/11/07 00:40:00 UTC

[jira] [Commented] (MADLIB-1390) DL: helper function for asymmetric cluster config

    [ https://issues.apache.org/jira/browse/MADLIB-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968826#comment-16968826 ] 

Frank McQuillan commented on MADLIB-1390:
-----------------------------------------


{code}
SELECT * FROM madlib.gpu_configuration();
 hostname |                                        gpu_descr                                         
----------+------------------------------------------------------------------------------------------
 phoenix0 | device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0
 phoenix0 | device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0
 phoenix0 | device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0
 phoenix0 | device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0
 phoenix1 | device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0
 phoenix1 | device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0
 phoenix1 | device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0
 phoenix1 | device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0
 phoenix2 | device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0
 phoenix2 | device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0
 phoenix2 | device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0
 phoenix2 | device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0
 phoenix3 | device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0
 phoenix3 | device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0
 phoenix3 | device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0
 phoenix3 | device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0
 phoenix4 | device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0
 phoenix4 | device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 6.0
 phoenix4 | device: 2, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:06.0, compute capability: 6.0
 phoenix4 | device: 3, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:07.0, compute capability: 6.0
(20 rows)
{code}
OK
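
For what it's worth, the set-returning form seems to compose fine in plain read-only queries; for example, a per-host GPU count (just a sketch against the output above, assuming the query stays on the entry db like the plain SELECT does):
{code}
-- Sketch: aggregate the set-returning output per host
SELECT hostname, count(*) AS num_gpus
FROM madlib.gpu_configuration()
GROUP BY hostname
ORDER BY hostname;
{code}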

However, the following does not work:
{code}
CREATE TABLE gpu_resources AS 
        SELECT * FROM madlib.gpu_configuration();

NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL policy entry.
ERROR:  plpy.SPIError: function cannot execute on segment because it issues a non-SELECT statement  (entry db 10.138.0.41:5432 pid=16886)
DETAIL:  
Traceback (most recent call last):
  PL/Python function "gpu_configuration", line 21, in <module>
    with AOControl(False) and MinWarning("error"):
  PL/Python function "gpu_configuration", line 154, in __enter__
PL/Python function "gpu_configuration"
SQL function "gpu_configuration" statement 1
{code}
The error indicates that the PL/Python function issues non-SELECT statements (the MinWarning/AOControl setup visible in the traceback), which is not allowed when CREATE TABLE AS dispatches the function to a segment. I think this is a problem, so we should have the helper write its results to an output table instead of being a set-returning function.
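
A minimal sketch of what that interface could look like (the output table argument, the table name gpu_resources, and the exact signature are assumptions here, not the current API):
{code}
-- Hypothetical interface: the helper writes one row per GPU into an output table
SELECT madlib.gpu_configuration('gpu_resources');            -- default source, e.g. 'tensorflow'
SELECT madlib.gpu_configuration('gpu_resources', 'nvidia');  -- explicit source

-- The materialized table can then be queried like any other relation
SELECT hostname, gpu_descr FROM gpu_resources ORDER BY hostname;
{code}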

> DL: helper function for asymmetric cluster config
> -------------------------------------------------
>
>                 Key: MADLIB-1390
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1390
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Deep Learning
>            Reporter: Nikhil Kak
>            Priority: Major
>             Fix For: v1.17
>
>
> h3. Helper function
> *Interface*
> gpu_configuration(source text) -- optional param. Can be one of 'tensorflow' or 'nvidia'. Defaults to 'tensorflow'
> SELECT * FROM madlib.gpu_configuration();
> SELECT * FROM madlib.gpu_configuration('tensorflow');
> SELECT * FROM madlib.gpu_configuration('nvidia');
>  
> *GPDB*
> List the state of the cluster, however it is configured:
> SELECT * FROM madlib.gpu_configuration();
>  
>     gpu_descr     | hostname
> ------------------+--------------------------
> NVIDIA Tesla P100 | host1
> NVIDIA Tesla P100 | host1
> Super Duper GPU   | host2
> Super Duper GPU   | host2
> (4 rows)
>  
> The results can be returned in a slightly different format, depending on what info is available about the GPU, but there should be 1 row per GPU.
>  
> Details:
>  # Ignore mirror segments and master host
>  
>  *POSTGRES*
> SELECT * FROM madlib.gpu_configuration();
> hostname  | gpu_descr
> ----------+-------------------
> localhost | NVIDIA Tesla P100
> (1 row)
>  


