Posted to dev@madlib.apache.org by GitBox <gi...@apache.org> on 2021/01/13 22:57:47 UTC

[GitHub] [madlib] khannaekta commented on a change in pull request #525: DL: Model Hopper Refactor

khannaekta commented on a change in pull request #525:
URL: https://github.com/apache/madlib/pull/525#discussion_r556801794



##########
File path: src/ports/postgres/modules/deep_learning/test/madlib_keras_fit_multiple.sql_in
##########
@@ -0,0 +1,845 @@
+m4_include(`SQLCommon.m4')
+m4_changequote(<<<,>>>)
+m4_ifdef(<<<__POSTGRESQL__>>>, -- Skip all fit multiple tests for postgres
+,<<<
+m4_changequote(<!,!>)
+
+-- =================== Setup & Initialization for FitMultiple tests ========================
+--
+--  For fit multiple, we test end-to-end functionality along with performance elsewhere.
+--  They take a long time to run.  Including similar tests here would probably not be worth
+--  the extra time added to dev-check.
+--
+--  Instead, we just want to unit test different python functions in the FitMultiple class.
+--  However, most of the important behavior we need to test requires access to an actual
+--  Greenplum database... mostly, we want to make sure that the models hop around to the
+--  right segments in the right order.  Therefore, the unit tests are here, as a part of
+--  dev-check. We mock fit_transition() and some validation functions in FitMultiple, but
+--  do NOT mock plpy, since most of the code we want to test is embedded SQL and needs to
+--  get through to gpdb. We also want to mock the number of segments, so we can test what
+--  the model hopping behavior will be for a large cluster, even though dev-check should be
+--  able to run on a single dev host.
+
+\i m4_regexp(MODULE_PATHNAME,
+             <!\(.*\)libmadlib\.so!>,
+            <!\1../../modules/deep_learning/test/madlib_keras_iris.setup.sql_in!>
+)
+
+-- Mock version() function to convince the InputValidator this is the real madlib schema
+CREATE OR REPLACE FUNCTION madlib_installcheck_deep_learning.version() RETURNS VARCHAR AS
+$$
+    SELECT MADLIB_SCHEMA.version();
+$$ LANGUAGE sql IMMUTABLE;
+
+-- Call this first to initialize the FitMultiple object, before anything else happens.
+-- Pass a real mst table and source table; the rest of the FitMultipleModel() constructor
+--  params are filled in.  They can be overridden later, before test functions are called, if necessary.
+CREATE OR REPLACE FUNCTION init_fit_mult(
+    source_table            VARCHAR,
+    model_selection_table   VARCHAR
+) RETURNS VOID AS
+$$
+    import sys
+    from mock import Mock, patch
+
+    PythonFunctionBodyOnlyNoSchema(deep_learning,madlib_keras_fit_multiple_model)
+    schema_madlib = 'madlib_installcheck_deep_learning'
+
+    GD['fit_mult'] = madlib_keras_fit_multiple_model.FitMultipleModel(
+        schema_madlib,
+        source_table,
+        'orig_model_out',
+        model_selection_table,
+        1
+    )
+    
+$$ LANGUAGE plpythonu VOLATILE
+m4_ifdef(<!__HAS_FUNCTION_PROPERTIES__!>, MODIFIES SQL DATA);
+
+CREATE OR REPLACE FUNCTION test_init_schedule(
+    schedule_table VARCHAR
+) RETURNS BOOLEAN AS
+$$
+    fit_mult = GD['fit_mult']
+    fit_mult.schedule_tbl = schedule_table
+
+    plpy.execute('DROP TABLE IF EXISTS {}'.format(schedule_table))
+    if fit_mult.init_schedule_tbl():
+        err_msg = None
+    else:
+        err_msg = 'FitMultiple.init_schedule_tbl() returned False'
+
+    return err_msg
+$$ LANGUAGE plpythonu VOLATILE
+m4_ifdef(`__HAS_FUNCTION_PROPERTIES__',MODIFIES SQL DATA);
+
+CREATE OR REPLACE FUNCTION test_rotate_schedule(
+    schedule_table          VARCHAR
+) RETURNS VOID AS
+$$
+    fit_mult = GD['fit_mult']
+
+    if fit_mult.schedule_tbl != schedule_table:
+        fit_mult.init_schedule_tbl()
+
+    fit_mult.rotate_schedule_tbl()
+
+$$ LANGUAGE plpythonu VOLATILE
+m4_ifdef(`__HAS_FUNCTION_PROPERTIES__',MODIFIES SQL DATA);
+
+-- Mock fit_transition function, for testing
+--  madlib_keras_fit_multiple_model() python code
+CREATE OR REPLACE FUNCTION madlib_installcheck_deep_learning.fit_transition_multiple_model(

Review comment:
       I am wondering whether using CREATE OR REPLACE for this function here would cause issues.
   Does this function get dropped once this test is over? If not, then whenever tests that run after this one call `fit_transition_multiple_model()`, won’t the call resolve to this mock instead of the actual madlib.fit_transition_multiple_model()?
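
   To make the concern concrete, here is a toy Python model of how an unqualified function call is resolved through a schema search path. It is only a sketch: the `resolve` helper, the `catalog` dict, and the search path order are assumptions for illustration, not MADlib or PostgreSQL code, and real PostgreSQL lookup also takes argument types into account.

```python
# Toy model of function lookup through a search path; not MADlib code.
# The schema and function names come from the diff above. The catalog
# contents and the search path order are assumptions for illustration.

def resolve(func_name, search_path, catalog):
    """Return the schema-qualified name an unqualified call would bind to.

    Walks search_path in order and returns the first schema that defines
    func_name, or None if no schema on the path defines it.
    """
    for schema in search_path:
        if func_name in catalog.get(schema, set()):
            return "{}.{}".format(schema, func_name)
    return None


catalog = {
    "madlib": {"fit_transition_multiple_model"},
    "madlib_installcheck_deep_learning": {"fit_transition_multiple_model"},
}
search_path = ["madlib_installcheck_deep_learning", "madlib"]

# While the mock exists and the test schema comes first, the mock wins.
shadowed = resolve("fit_transition_multiple_model", search_path, catalog)

# Once the mock is dropped, the same call binds to the real function again.
catalog["madlib_installcheck_deep_learning"].discard("fit_transition_multiple_model")
restored = resolve("fit_transition_multiple_model", search_path, catalog)

print(shadowed)
print(restored)
```

   In this model the shadowing goes away only once the mock is removed from the test schema, which is why it matters whether the function is dropped when the test finishes. Calls that are already schema-qualified with `madlib.` would not be affected either way.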
   



