Posted to dev@madlib.apache.org by GitBox <gi...@apache.org> on 2019/05/31 23:57:22 UTC

[GitHub] [madlib] fmcquillan99 edited a comment on issue #395: DL: madlib_keras_evaluate() function

URL: https://github.com/apache/madlib/pull/395#issuecomment-497890456
 
 
   Please have a look at (4) and (5) below:
   (4) please provide a better warning
   (5) a better error message will come in a later commit
   
   
   (0)
   interface
   
   ```
   CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.madlib_keras_evaluate(
       model_table             VARCHAR,
       test_table              VARCHAR,
       output_table            VARCHAR,
       gpus_per_host           INTEGER
   ```
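    The last argument appears to be optional: the calls in (2) and (3) below omit gpus_per_host and run fine on CPU, while (5) passes a count explicitly. For completeness, a minimal sketch of the fully spelled-out CPU call (same tables as in the examples below):
    
    ```
    -- Sketch: same call as in (3) below, but with gpus_per_host given
    -- explicitly as 0 (CPU-only), mirroring the fit call in (1).
    DROP TABLE IF EXISTS iris_validate;
    
    SELECT madlib.madlib_keras_evaluate('iris_model',        -- model
                                        'iris_test_packed',  -- packed test table
                                        'iris_validate',     -- output table
                                        0);                  -- gpus per host
    ```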
   
   
   (1)
   train a model
   
   ```
   DROP TABLE IF EXISTS iris_model, iris_model_summary;
   
   SELECT madlib.madlib_keras_fit('iris_train_packed',   -- source table
                                  'iris_model',          -- model output table
                                  'model_arch_library',  -- model arch table
                                   1,                    -- model arch id
                                   $$ loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'] $$,  -- compile_params
                                   $$ batch_size=5, epochs=3 $$,  -- fit_params
                                   20                    -- num_iterations
                                 );
   
   DROP TABLE IF EXISTS iris_model, iris_model_summary;
   
   SELECT madlib.madlib_keras_fit('iris_train_packed',   -- source table
                                  'iris_model',          -- model output table
                                  'model_arch_library',  -- model arch table
                                   1,                    -- model arch id
                                   $$ loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'] $$,  -- compile_params
                                   $$ batch_size=5, epochs=3 $$,  -- fit_params
                                   20,                   -- num_iterations
                                   0,                    -- GPUs per host
                                   'iris_test_packed'   -- validation dataset
                                 );
   
   
   INFO:  Training set metric after iteration 20: 0.975000023842.
   CONTEXT:  PL/Python function "madlib_keras_fit"
   INFO:  Training set loss after iteration 20: 0.222917750478.
   CONTEXT:  PL/Python function "madlib_keras_fit"
   INFO:  Validation set metric after iteration 20: 0.966666638851.
   CONTEXT:  PL/Python function "madlib_keras_fit"
   INFO:  Validation set loss after iteration 20: 0.176595017314.
   CONTEXT:  PL/Python function "madlib_keras_fit"
   ```
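    (Side note: fit also writes a summary table next to the model, which is a handy sanity check on what the run recorded. Exact columns depend on the MADlib version, so the sketch below just selects everything.)
    
    ```
    -- Sketch: inspect the summary table created alongside iris_model by fit
    SELECT * FROM iris_model_summary;
    ```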
   
   
   (2)
   evaluate the training set
   
   ```
   DROP TABLE IF EXISTS iris_validate;
   
   SELECT madlib.madlib_keras_evaluate('iris_model',    -- model
                                      'iris_train_packed',     -- training table
                                      'iris_validate'  -- output table
                                      );
   SELECT * FROM iris_validate;
          loss        |      metric       | metrics_type
   -------------------+-------------------+--------------
    0.222917750477791 | 0.975000023841858 | {accuracy}
   (1 row)
   ```
   
    OK, the evaluate loss and accuracy match the fit output above.
    
    Now manually count accuracy via predict:
   
   ```
   DROP TABLE IF EXISTS iris_predict;
   
   SELECT madlib.madlib_keras_predict('iris_model', -- model
                                      'iris_train',  -- train_table
                                      'id',  -- id column
                                      'attributes', -- independent var
                                      'iris_predict'  -- output table
                                      );
   
   SELECT round(count(*)*100/(150.0*0.8),2) as train_accuracy_percent from
       (select iris_train.class_text as actual, iris_predict.estimated_class_text as estimated
        from iris_predict inner join iris_train
        on iris_train.id=iris_predict.id) q
   WHERE q.actual=q.estimated;
   
    train_accuracy_percent
   ------------------------
                     97.50
   ```
   
    OK, the evaluate accuracy matches the accuracy computed via predict.
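    As an aside, the 150*0.8 denominator can be derived from the join itself instead of being hard-coded; a sketch of the same check without the magic number:
    
    ```
    -- Sketch: same accuracy check, with the denominator taken from the
    -- number of joined rows rather than hard-coding 150 * 0.8.
    SELECT round(100.0 * sum(CASE WHEN t.class_text = p.estimated_class_text
                                  THEN 1 ELSE 0 END) / count(*), 2)
           AS train_accuracy_percent
    FROM iris_train t
    JOIN iris_predict p ON t.id = p.id;
    ```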
   
   
   (3)
   evaluate the test set
   
   ```
   DROP TABLE IF EXISTS iris_validate;
   
   SELECT madlib.madlib_keras_evaluate('iris_model',    -- model
                                       'iris_test_packed',     -- test table
                                       'iris_validate'  -- output table
                                       );
    SELECT * FROM iris_validate;
           loss        |      metric       | metrics_type
   -------------------+-------------------+--------------
     0.176595017313957 | 0.966666638851166 | {accuracy}
    (1 row)
   ```
   
    OK, the evaluate loss and accuracy match the fit output above.
    
    Now manually count accuracy via predict:
   
   ```
   DROP TABLE IF EXISTS iris_predict;
   
   SELECT madlib.madlib_keras_predict('iris_model', -- model
                                      'iris_test',  -- test_table
                                      'id',  -- id column
                                      'attributes', -- independent var
                                      'iris_predict'  -- output table
                                      );
   
   SELECT round(count(*)*100/(150.0*0.2),2) as test_accuracy_percent from
       (select iris_test.class_text as actual, iris_predict.estimated_class_text as estimated
        from iris_predict inner join iris_test
        on iris_test.id=iris_predict.id) q
   WHERE q.actual=q.estimated;
   
    test_accuracy_percent
   -----------------------
                    96.67
   (1 row)
   ```
    OK, the evaluate accuracy matches the accuracy computed via predict.
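    To see where the remaining misclassifications land, the same join also gives a confusion-matrix style breakdown (a sketch in plain SQL over the predict output):
    
    ```
    -- Sketch: cross-tabulate actual vs. estimated class for the test set
    SELECT t.class_text           AS actual,
           p.estimated_class_text AS estimated,
           count(*)               AS n
    FROM iris_test t
    JOIN iris_predict p ON t.id = p.id
    GROUP BY 1, 2
    ORDER BY 1, 2;
    ```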
   
   
   (4)
    pass in a table that has not been mini-batched
   
   ```
   DROP TABLE IF EXISTS iris_validate;
   
   SELECT madlib.madlib_keras_evaluate('iris_model',    -- model
                                      'iris_test',     -- test table
                                      'iris_validate'  -- output table
                                      );
   WARNING:  column "independent_var" does not exist
   CONTEXT:  PL/Python function "madlib_keras_evaluate"
   ERROR:  plpy.Error: madlib_keras_evaluate error: invalid independent_varname ('independent_var') for test table (iris_test). (plpython.c:5038)
   CONTEXT:  Traceback (most recent call last):
     PL/Python function "madlib_keras_evaluate", line 21, in <module>
       return madlib_keras.evaluate(**globals())
     PL/Python function "madlib_keras_evaluate", line 547, in evaluate
     PL/Python function "madlib_keras_evaluate", line 155, in __init__
     PL/Python function "madlib_keras_evaluate", line 99, in __init__
     PL/Python function "madlib_keras_evaluate", line 158, in _validate_input_args
     PL/Python function "madlib_keras_evaluate", line 107, in _validate_input_args
     PL/Python function "madlib_keras_evaluate", line 132, in _validate_test_tbl_cols
     PL/Python function "madlib_keras_evaluate", line 96, in _assert
   PL/Python function "madlib_keras_evaluate"
   ```
   
    This will be a common mistake; I'm not sure the message above is the clearest we could give.
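    In the meantime, a quick manual check for whether a table has been through the mini-batch preprocessor is to look for the packed columns in the catalog (independent_var is the column named in the warning above; I'm assuming dependent_var is its counterpart in the packed format):
    
    ```
    -- Sketch: does this table look like a packed (mini-batched) table?
    SELECT column_name
    FROM information_schema.columns
    WHERE table_name = 'iris_test'
      AND column_name IN ('independent_var', 'dependent_var');
    -- zero rows back => the table has not been mini-batched
    ```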
   
   
   (5)
    ask for a GPU when none is available
   
   ```
   DROP TABLE IF EXISTS iris_validate;
   
   SELECT madlib.madlib_keras_evaluate('iris_model',    -- model
                                       'iris_test_packed',     -- test table
                                      'iris_validate',  -- output table
                                      1 -- gpus per host
                                      );
   
   WARNING:  The number of gpus per host is less than the number of segments per host. The support for this case is experimental and it may fail.
   CONTEXT:  PL/Python function "madlib_keras_evaluate"
   ERROR:  plpy.SPIError: tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation Adam/iterations: node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled. (plpython.c:5038)  (seg1 slice1 10.128.0.41:40001 pid=20148) (plpython.c:5038)
   DETAIL:
   
   	 [[node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) ]]
   
   Caused by op u'Adam/iterations', defined at:
     File "<string>", line 1, in <module>
     File "<string>", line 7, in __plpython_procedure_internal_keras_eval_transition_110195
     File "/home/gpadmin/madlib/build/src/ports/greenplum/5/modules/deep_learning/madlib_keras.py", line 644, in internal_keras_eval_transition
       serialized_weights)
     File "/home/gpadmin/madlib/build/src/ports/greenplum/5/modules/deep_learning/madlib_keras_wrapper.py", line 109, in compile_and_set_weights
       compile_model(segment_model, compile_params)
     File "/home/gpadmin/madlib/build/src/ports/greenplum/5/modules/deep_learning/madlib_keras_wrapper.py", line 309, in compile_model
       model.compile(**compile_dict)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/engine/training.py", line 96, in compile
       self.optimizer = optimizers.get(optimizer)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 796, in get
       return deserialize(config)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 768, in deserialize
       printable_module_name='optimizer')
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
       return cls.from_config(config['config'])
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 154, in from_config
       return cls(**config)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 462, in __init__
       self.iterations = K.variable(0, dtype='int64', name='iterations')
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 402, in variable
       v = tf.Variable(value, dtype=tf.as_dtype(dtype), name=name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 213, in __call__
       return cls._variable_v1_call(*args, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 176, in _variable_v1_call
       aggregation=aggregation)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 155, in <lambda>
       previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2495, in default_variable_creator
       expected_shape=expected_shape, import_scope=import_scope)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 217, in __call__
       return super(VariableMetaclass, cls).__call__(*args, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 1395, in __init__
       constraint=constraint)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 1531, in _init_from_args
       name=name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 79, in variable_op_v2
       shared_name=shared_name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 1425, in variable_v2
       shared_name=shared_name, name=name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
       op_def=op_def)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
       return func(*args, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
       op_def=op_def)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
       self._traceback = tf_stack.extract_stack()
   
   InvalidArgumentError (see above for traceback): Cannot assign a device for operation Adam/iterations: node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
   	 [[node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) ]]
   Traceback (most recent call last):
     PL/Python function "internal_keras_eval_transition", line 6, in <module>
       return madlib_keras.internal_keras_eval_transition(**globals())
     PL/Python function "internal_keras_eval_transition", line 643, in internal_keras_eval_transition
     PL/Python function "internal_keras_eval_transition", line 111, in compile_and_set_weights
     PL/Python function "internal_keras_eval_transition", line 507, in set_weights
     PL/Python function "internal_keras_eval_transition", line 2469, in batch_set_value
     PL/Python function "internal_keras_eval_transition", line 198, in get_session
     PL/Python function "internal_keras_eval_transition", line 928, in run
     PL/Python function "internal_keras_eval_transition", line 1151, in _run
     PL/Python function "internal_keras_eval_transition", line 1327, in _do_run
     PL/Python function "internal_keras_eval_transition", line 1347, in _do_call
   PL/Python function "internal_keras_eval_transition"
   CONTEXT:  Traceback (most recent call last):
     PL/Python function "madlib_keras_evaluate", line 21, in <module>
       return madlib_keras.evaluate(**globals())
     PL/Python function "madlib_keras_evaluate", line 582, in evaluate
     PL/Python function "madlib_keras_evaluate", line 625, in get_loss_metric_from_keras_eval
   PL/Python function "madlib_keras_evaluate"
   ```
   
    Is this ^^^ the right error message to throw, and is this the stack trace we want to surface?
   
   ```
   DROP TABLE IF EXISTS iris_validate;
   
   SELECT madlib.madlib_keras_evaluate('iris_model',    -- model
                                       'iris_test_packed',     -- test table
                                      'iris_validate',  -- output table
                                      5 -- gpus per host
                                      );
   
   ERROR:  plpy.SPIError: tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation Adam/iterations: node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled. (plpython.c:5038)  (seg0 slice1 10.128.0.41:40000 pid=20691) (plpython.c:5038)
   DETAIL:
   
   	 [[node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) ]]
   
   Caused by op u'Adam/iterations', defined at:
     File "<string>", line 1, in <module>
     File "<string>", line 7, in __plpython_procedure_internal_keras_eval_transition_110195
     File "/home/gpadmin/madlib/build/src/ports/greenplum/5/modules/deep_learning/madlib_keras.py", line 644, in internal_keras_eval_transition
       serialized_weights)
     File "/home/gpadmin/madlib/build/src/ports/greenplum/5/modules/deep_learning/madlib_keras_wrapper.py", line 109, in compile_and_set_weights
       compile_model(segment_model, compile_params)
     File "/home/gpadmin/madlib/build/src/ports/greenplum/5/modules/deep_learning/madlib_keras_wrapper.py", line 309, in compile_model
       model.compile(**compile_dict)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/engine/training.py", line 96, in compile
       self.optimizer = optimizers.get(optimizer)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 796, in get
       return deserialize(config)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 768, in deserialize
       printable_module_name='optimizer')
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
       return cls.from_config(config['config'])
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 154, in from_config
       return cls(**config)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/optimizers.py", line 462, in __init__
       self.iterations = K.variable(0, dtype='int64', name='iterations')
     File "/home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 402, in variable
       v = tf.Variable(value, dtype=tf.as_dtype(dtype), name=name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 213, in __call__
       return cls._variable_v1_call(*args, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 176, in _variable_v1_call
       aggregation=aggregation)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 155, in <lambda>
       previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2495, in default_variable_creator
       expected_shape=expected_shape, import_scope=import_scope)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 217, in __call__
       return super(VariableMetaclass, cls).__call__(*args, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 1395, in __init__
       constraint=constraint)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 1531, in _init_from_args
       name=name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 79, in variable_op_v2
       shared_name=shared_name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 1425, in variable_v2
       shared_name=shared_name, name=name)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
       op_def=op_def)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
       return func(*args, **kwargs)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
       op_def=op_def)
     File "/home/gpadmin/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
       self._traceback = tf_stack.extract_stack()
   
   InvalidArgumentError (see above for traceback): Cannot assign a device for operation Adam/iterations: node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
   	 [[node Adam/iterations (defined at /home/gpadmin/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:402) ]]
   Traceback (most recent call last):
     PL/Python function "internal_keras_eval_transition", line 6, in <module>
       return madlib_keras.internal_keras_eval_transition(**globals())
     PL/Python function "internal_keras_eval_transition", line 643, in internal_keras_eval_transition
     PL/Python function "internal_keras_eval_transition", line 111, in compile_and_set_weights
     PL/Python function "internal_keras_eval_transition", line 507, in set_weights
     PL/Python function "internal_keras_eval_transition", line 2469, in batch_set_value
     PL/Python function "internal_keras_eval_transition", line 198, in get_session
     PL/Python function "internal_keras_eval_transition", line 928, in run
     PL/Python function "internal_keras_eval_transition", line 1151, in _run
     PL/Python function "internal_keras_eval_transition", line 1327, in _do_run
     PL/Python function "internal_keras_eval_transition", line 1347, in _do_call
   PL/Python function "internal_keras_eval_transition"
   CONTEXT:  Traceback (most recent call last):
     PL/Python function "madlib_keras_evaluate", line 21, in <module>
       return madlib_keras.evaluate(**globals())
     PL/Python function "madlib_keras_evaluate", line 582, in evaluate
     PL/Python function "madlib_keras_evaluate", line 625, in get_loss_metric_from_keras_eval
   PL/Python function "madlib_keras_evaluate"
   ```
   
    Is this ^^^ the right error message to throw, and is this the stack trace we want to surface?
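    For what it's worth, a quick way to confirm which devices TensorFlow can actually see on a host before asking for GPUs is a small helper like the one below (hypothetical, not part of MADlib; assumes plpythonu and the TF 1.x client API on the segments):
    
    ```
    -- Hypothetical helper: list the devices TensorFlow reports on the host
    -- executing this function. On the box above it would return only the
    -- CPU/XLA_CPU devices, which is why the GPU request fails.
    CREATE OR REPLACE FUNCTION list_tf_devices()
    RETURNS SETOF TEXT AS
    $BODY$
        from tensorflow.python.client import device_lib
        return [d.name for d in device_lib.list_local_devices()]
    $BODY$ LANGUAGE plpythonu;
    
    SELECT list_tf_devices();
    ```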
   
