Posted to issues@madlib.apache.org by "Frank McQuillan (Jira)" <ji...@apache.org> on 2020/08/06 23:37:00 UTC
[jira] [Created] (MADLIB-1446) DL: Hyperband phase 2 - generate MST table
Frank McQuillan created MADLIB-1446:
---------------------------------------
Summary: DL: Hyperband phase 2 - generate MST table
Key: MADLIB-1446
URL: https://issues.apache.org/jira/browse/MADLIB-1446
Project: Apache MADlib
Issue Type: New Feature
Components: Deep Learning
Reporter: Frank McQuillan
Fix For: v1.18.0
Python code that does some version of this is in https://github.com/apache/madlib-site/blob/asf-site/community-artifacts/Deep-learning/automl/hyperband-diag-cifar10-v1.ipynb, in the methods `setup_full_schedule()` and `create_mst_superset()`. Combine it with the random search function from https://www.pivotaltracker.com/story/show/173692930
**Story**
Generate the MST table and validate the input params (to the extent possible without implementing the whole method). This story does not implement the full Hyperband method. The proposed interface:
{code}
madlib_keras_automl(
source_table, -- input
model_output_table, -- output
model_selection_table, -- output
model_arch_table, -- input
model_id_list,
compile_params_grid,
fit_params_grid,
automl_method, -- new params vvv
automl_params,
random_state, -- optional -- from generate model configs vvv
object_table, -- optional
use_gpus, -- optional -- from fit multiple vvv
validation_table, -- optional
metrics_compute_frequency, -- optional
name, -- optional
description -- optional
)
{code}
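As a rough illustration of the input validation mentioned in the story, here is a minimal Python 3 sketch for the `automl_params` argument when `automl_method` is 'hyperband'. The function name `validate_hyperband_params` and every rule in it (R and eta required, all three values integers, `skip_last` defaulting to 0) are assumptions made for this example, not behaviour specified in this ticket.

```python
def validate_hyperband_params(automl_params):
    """Hypothetical check for a dict such as {'R': 81, 'eta': 3, 'skip_last': 0}.

    Returns a normalized copy, or raises ValueError. All rules are assumed.
    """
    allowed = {'R', 'eta', 'skip_last'}
    unknown = set(automl_params) - allowed
    if unknown:
        raise ValueError("unknown automl_params: {0}".format(sorted(unknown)))

    params = {
        'R': automl_params.get('R'),
        'eta': automl_params.get('eta'),
        'skip_last': automl_params.get('skip_last', 0),  # assumed default
    }
    for name, value in params.items():
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(
                "{0} must be an integer, got {1!r}".format(name, value))

    if params['R'] < 1:
        raise ValueError("R must be at least 1")
    if params['eta'] < 2:
        raise ValueError("eta must be at least 2")
    if params['skip_last'] < 0:
        raise ValueError("skip_last must be non-negative")
    return params


print(validate_hyperband_params({'R': 81, 'eta': 3}))
```

Rejecting unknown keys up front is a design choice for the sketch: it catches typos such as `Eta` before any table is created.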
Here are the output tables:
(1)
<model_output_table>
Same as model output table in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows
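The row count quoted above can be reproduced with a short Python sketch. It assumes a floor-based variant of the Hyperband bracket-size formula, n = floor((s_max+1)/(s+1)) * eta^s, chosen here only because it yields 81, 27, 9, 6 and 5 for R=81 with a reduction factor of 3; the notebook's `setup_full_schedule()` may compute it differently. The function name `hyperband_bracket_sizes` is hypothetical.

```python
def hyperband_bracket_sizes(R, eta):
    """Return {bracket s: number of initial configurations} (assumed formula)."""
    # s_max = floor(log_eta(R)), computed with integers to avoid float rounding
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1

    sizes = {}
    for s in range(s_max, -1, -1):
        # floor-based variant: integer division before scaling by eta**s
        sizes[s] = ((s_max + 1) // (s + 1)) * eta ** s
    return sizes


sizes = hyperband_bracket_sizes(81, 3)
print(sizes)                # {4: 81, 3: 27, 2: 9, 1: 6, 0: 5}
print(sum(sizes.values()))  # 128
```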
(2)
<model_output_table>_summary
Same as model output table summary in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
It will have 1 row, with the following columns added on the right side of the table:
{code}
use_gpus BOOLEAN e.g., TRUE -- this is missing from summary table from before
automl_method TEXT e.g., 'hyperband'
automl_params_names TEXT[] e.g., {'R', 'eta', 'skip_last' }
automl_params_vals TEXT[] e.g., {'81', '3', 'TRUE'} -- note this needs to be text array since mixed types of autoML params
{code}
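One way to fill `automl_params_names` and `automl_params_vals` is to flatten the params dictionary into two parallel lists of strings, which sidesteps the mixed-type problem noted above. This is a sketch only; the helper name `automl_params_to_text_arrays` and the choice to sort the names are assumptions.

```python
def automl_params_to_text_arrays(automl_params):
    """Split a params dict into parallel lists of names and text values."""
    names = sorted(automl_params)  # assumed: deterministic column order
    vals = []
    for name in names:
        value = automl_params[name]
        if isinstance(value, bool):
            # render booleans in SQL style rather than Python's 'True'
            vals.append('TRUE' if value else 'FALSE')
        else:
            vals.append(str(value))
    return names, vals


names, vals = automl_params_to_text_arrays(
    {'R': 81, 'eta': 3, 'skip_last': True})
print(names)  # ['R', 'eta', 'skip_last']
print(vals)   # ['81', '3', 'TRUE']
```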
(3)
<model_output_table>_info
Same as model output table info in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows, with the following columns added on the right side of the table:
{code}
s INTEGER "Bracket number" e.g., 4
i INTEGER "Depth in bracket model trained to" e.g., 3
{code}
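To make the `s` and `i` columns concrete, the Python 3 sketch below lists the rounds inside each bracket using the usual successive-halving rule: round i keeps floor(n / eta^i) configurations and trains each for R * eta^i / eta^s iterations. Counting `i` from 0, the floor-based bracket size, and the function name `hyperband_rungs` are all assumptions for illustration; the ticket does not fix these details.

```python
def hyperband_rungs(R, eta):
    """Return a list of (s, i, n_i, r_i) tuples, one per round (assumed rules)."""
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1

    rungs = []
    for s in range(s_max, -1, -1):
        # initial configurations in bracket s (floor-based variant, assumed)
        n = ((s_max + 1) // (s + 1)) * eta ** s
        for i in range(s + 1):
            n_i = n // eta ** i           # configurations still alive
            r_i = R * eta ** i / eta ** s  # iterations per configuration
            rungs.append((s, i, n_i, r_i))
    return rungs


for row in hyperband_rungs(81, 3):
    if row[0] == 1:
        print(row)
# (1, 0, 6, 27.0)
# (1, 1, 2, 81.0)
```

Under these assumptions, a model in bracket s=4 that survives three halvings would get i=3 in the info table.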
(4)
<model_selection_table>
Same as model selection table in
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows
(5)
<model_selection_table>_summary
Same as model selection table summary in
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html
**Acceptance**
1) For `R=81, eta=3` check that it creates the correct MST tables <model_selection_table> and <model_selection_table>_summary
2) Set `skip_last=1` and check that it creates the correct MST tables
3) Try multiple other values to see if it produces the correct schedule