Posted to issues@madlib.apache.org by "Orhan Kislal (JIRA)" <ji...@apache.org> on 2019/04/05 00:07:00 UTC

[jira] [Commented] (MADLIB-1317) Multinomial results not matching with R method

    [ https://issues.apache.org/jira/browse/MADLIB-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810423#comment-16810423 ] 

Orhan Kislal commented on MADLIB-1317:
--------------------------------------

w/ [~khannaekta]
Hi Pratik,

We checked the output of multinom with a few sample datasets and they seemed to match the R output. 

The ref_category is an optional parameter. If you leave it as NULL, madlib should pick the first category (as the R function does). 

Could you try it without the ref_category and give us a sample dataset that exhibits this behavior?

> Multinomial results not matching with R method
> ----------------------------------------------
>
>                 Key: MADLIB-1317
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1317
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Multinomial Logistic Regression
>            Reporter: Pratik
>            Priority: Major
>
> Hi team,
> I have been using the MADlib multinomial method on my dataset with a categorical independent variable (one-hot encoded), as below.
>  
> {code:java}
> SELECT
>     CASE WHEN multinom IS NOT NULL THEN TRUE ELSE FALSE END
> FROM
>  madlib.multinom(
>     'TEMP_TEST_1',
>     'TEMP_TEST_1_OP',
>     'dep_var_col',
>     'ARRAY[ 1,hot_encoded_GENDER_col_val1, hot_encoded_GENDER_col_val2]',
>     '1',--REF CATEGORY 
>     'logit',
>     NULL,
>     'max_iter=100,optimizer=irls,tolerance=0.0001',
>     TRUE
>  );{code}
> Gender is a categorical column, so I one-hot encode it into two 0/1 columns.
> When comparing the results with R's method, the coefficients match, but the StdErr and pValue are way off.
> R method -
> {code:java}
> nnet::multinom
> {code}
>  
> Is there anything special I need to do for multinom, or is it a bug?
> Or is there a particular way I need to use R to compare results with multinom?
> *UPDATE:*
> Is it mandatory to have a ref_category-like (baseline) column for a categorical independent variable?
> I removed hot_encoded_GENDER_col_val1 from the list of independent variables, and the results now match R's output.
>  
> Is there any documentation or reference to confirm this? 
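
One possible reading of the update in the quoted report (this thread does not confirm it) is the usual dummy-variable collinearity: with an intercept plus both one-hot columns, the two gender columns always sum to the intercept column, so the design matrix is rank deficient. Standard errors come from inverting a matrix built from the design matrix, and that inversion is ill-defined when the columns are collinear. The sketch below uses hypothetical toy data and plain X^T X as a stand-in for that matrix; it is not MADlib's or nnet's actual computation.

```python
from fractions import Fraction


def gram(rows):
    """Return X^T X for a design matrix given as a list of rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def det(m):
    """Determinant via Gaussian elimination on exact fractions."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot: the matrix is singular
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result


# Hypothetical toy data: 1 marks the first gender level, 0 the second.
gender_is_val1 = [1, 0, 1, 1, 0]

# Intercept plus BOTH one-hot columns, mirroring ARRAY[1, val1, val2].
full = [[1, g, 1 - g] for g in gender_is_val1]
# Intercept plus ONE one-hot column; the dropped level acts as the baseline.
reduced = [[1, g] for g in gender_is_val1]

det_full = det(gram(full))        # zero: columns are linearly dependent
det_reduced = det(gram(reduced))  # non-zero: matrix can be inverted
print(det_full, det_reduced)
```

If this is what is happening, dropping one encoded column per categorical variable (as in the update) makes the matrix invertible, which would explain why the standard errors then agree with R.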


