Posted to issues@madlib.apache.org by "Domino Valdano (JIRA)" <ji...@apache.org> on 2019/04/26 19:20:00 UTC
[jira] [Updated] (MADLIB-1332) DL: Support mini-batched validation data for fit/evaluate

     [ https://issues.apache.org/jira/browse/MADLIB-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Domino Valdano updated MADLIB-1332:
-----------------------------------
    Description: 
Currently, {{keras_evaluate()}} is implemented by calling {{internal_keras_evaluate()}} as a user-defined function (UDF).  This requires the validation table passed to {{keras_fit()}} to hold only one image per row, even though the training table uses a different format, with a batch of images in every row.  This is potentially confusing and cumbersome for users, and preliminary testing suggests that passing only one image at a time to {{keras_evaluate()}} also slows down performance.

We can solve this by converting {{internal_keras_evaluate()}} into a user-defined aggregate (UDA), so that it runs on a minibatched validation table in the same form as the training table.

 

Tasks:
 * Convert the {{internal_keras_evaluate}} UDF to a UDA and perform weighted averaging of loss and accuracy.
 * Since x and y will now be minibatched, we no longer need to add another dimension to the {{x}} and {{y}} NumPy arrays in {{internal_keras_evaluate}}.
 * Compare the UDF to the UDA and verify that the UDA results in a speed improvement.
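
The weighted averaging in the first task could be sketched roughly as follows. This is only an illustration, not the MADlib implementation: the function names ({{eval_transition}}, {{eval_merge}}, {{eval_final}}) and the state layout are hypothetical, and it assumes each mini-batch is weighted by the number of images it contains.

```python
# Hypothetical aggregate state: [weighted loss sum, weighted accuracy sum, image count].
# None of these names come from MADlib; they only illustrate the UDA pattern.

def eval_transition(state, batch_loss, batch_accuracy, batch_size):
    """Fold the metrics of one mini-batch row into the running state."""
    if state is None:
        state = [0.0, 0.0, 0]
    return [state[0] + batch_loss * batch_size,
            state[1] + batch_accuracy * batch_size,
            state[2] + batch_size]

def eval_merge(state_a, state_b):
    """Combine partial states computed independently (e.g. on different segments)."""
    if state_a is None:
        return state_b
    if state_b is None:
        return state_a
    return [a + b for a, b in zip(state_a, state_b)]

def eval_final(state):
    """Turn the accumulated sums into per-image average loss and accuracy."""
    loss_sum, accuracy_sum, image_count = state
    return loss_sum / image_count, accuracy_sum / image_count

# Two mini-batches of unequal size, as (loss, accuracy, number of images).
seg1 = eval_transition(None, 0.5, 0.8, 4)
seg2 = eval_transition(None, 1.0, 0.5, 2)
loss, accuracy = eval_final(eval_merge(seg1, seg2))
```

Weighting by batch size matters when the last mini-batch is smaller than the rest: a plain mean of the per-batch values would give every batch equal influence regardless of how many images it holds.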

 

  was:
Currently, `keras_evaluate()` is implemented by calling `internal_keras_evaluate() as a UDF.  This requires the validation table passed to `keras_fit()` to be in a format with only 1 image per row, even though the training table is in a different format, with a batch of images in every row.  This is potentially confusing and cumbersome for users to deal with, and based on some preliminary testing it seems that passing only 1 image at a time to `keras_evaluate()` is also slowing down performance.

We can solve this by converting `internal_keras_evaluate()` into a UDA, so that it runs on a minibatched validation table in the same form as the training table.

 

Tasks:
 * Convert the {{internal_keras_evaluate}} UDF to a UDA and perform weighted averaging of loss and accuracy.
 * Since x and y will now be minibatched, we don't need to add another dimension to {{x and y}} np arrays in {{internal_keras_evaluate}}.
 * Compare UDF to UDA and verify that the UDA results in a speed improvement

 


> DL: Support mini-batched validation data for fit/evaluate
> ---------------------------------------------------------
>
>                 Key: MADLIB-1332
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1332
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Deep Learning
>            Reporter: Domino Valdano
>            Priority: Major
>             Fix For: v1.16
>


