Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2016/11/29 22:49:58 UTC

[jira] [Comment Edited] (MADLIB-1038) Improvements to encoding categorical variables

    [ https://issues.apache.org/jira/browse/MADLIB-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706798#comment-15706798 ] 

Frank McQuillan edited comment on MADLIB-1038 at 11/29/16 10:49 PM:
--------------------------------------------------------------------

I updated the requirements doc; it seems more or less complete for this go-around.

Please let me know if you see something that needs addressing.

Here is the old interface:

{code}
create_indicator_variables (
    source_table,
    output_table,
    categorical_cols,
    keep_null,          -- Optional
    distributed_by      -- Optional
)
{code}

Here is the proposed new interface:

{code}
encode_categorical_variables (
    source_table,
    output_table,
    categorical_cols,
    categorical_cols_to_exclude,    -- Optional
    row_id,                         -- Optional
    top,                            -- Optional
    value_to_drop,                  -- Optional
    keep_null,                      -- Optional
    array_output,                   -- Optional
    output_col_dictionary,          -- Optional
    distributed_by                  -- Optional
)
{code}
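
To make the new arguments more concrete, here is a rough Python sketch of how top, value_to_drop and keep_null might interact for a single column. This is not MADlib code. The function name encode_categorical, the output column naming (col_value, col_null) and the exact semantics (top as a count of the most frequent values, value_to_drop as a reference level left out of the output) are assumptions for illustration only; the attached requirements doc is the authority.

```python
from collections import Counter


def encode_categorical(rows, col, top=None, value_to_drop=None, keep_null=False):
    """Hypothetical sketch: one-hot encode one column of a list of dict rows.

    Assumed semantics (not taken from MADlib):
      top           -- keep only the `top` most frequent non-null values
      value_to_drop -- omit this value's indicator column (reference level)
      keep_null     -- add an extra indicator column for NULL (None) values
    """
    counts = Counter(r[col] for r in rows if r[col] is not None)
    # Most frequent first; ties broken by value so the order is deterministic.
    levels = sorted(counts, key=lambda v: (-counts[v], str(v)))
    if top is not None:
        levels = levels[:top]
    if value_to_drop is not None:
        levels = [v for v in levels if v != value_to_drop]

    names = [f"{col}_{v}" for v in levels]
    if keep_null:
        names.append(f"{col}_null")

    encoded = []
    for r in rows:
        # Carry over all other columns unchanged.
        out = {k: v for k, v in r.items() if k != col}
        for v in levels:
            out[f"{col}_{v}"] = 1 if r[col] == v else 0
        if keep_null:
            out[f"{col}_null"] = 1 if r[col] is None else 0
        encoded.append(out)
    return names, encoded


rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
    {"id": 4, "color": None},
]
names, encoded = encode_categorical(
    rows, "color", value_to_drop="blue", keep_null=True
)
```

In this sketch "blue" acts as the dropped reference level, so each row ends up with only a color_red and a color_null indicator next to its id.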



was (Author: fmcquillan):
Updated the requirements doc, seems more or less complete for this go around. 

Please let me know if you see something that needs addressing.

> Improvements to encoding categorical variables
> ----------------------------------------------
>
>                 Key: MADLIB-1038
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1038
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Utilities
>            Reporter: Frank McQuillan
>            Assignee: Rahul Iyer
>             Fix For: v1.10
>
>         Attachments: Encoding categorical variables requirements - 29 nov 2016.pdf
>
>
> For the module
> http://madlib.incubator.apache.org/docs/latest/group__grp__data__prep.html
> there are several improvements that can be made.
> Please see attached requirements document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)