Posted to dev@spark.apache.org by Danil Kirsanov <da...@gmail.com> on 2016/09/13 23:45:59 UTC

Nominal Attribute

NominalAttribute in MLlib is used to represent categorical data internally.
It is barely documented, though, and has a number of limitations: for example,
it supports only integer and string data.
Is there any current effort to expose it (and categorical data handling in
general) to users, or is it intended to remain an internal MLlib data
representation only?

Thank you,
Danil



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Nominal-Attribute-tp18935.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: Nominal Attribute

Posted by Joseph Bradley <jo...@databricks.com>.
There are plans, but not concrete ones yet:
https://issues.apache.org/jira/browse/SPARK-8515
I agree categorical data handling is a pain point and that we need to
improve it!
