Posted to user@madlib.apache.org by LUYAO CHEN <lu...@hotmail.com> on 2018/07/19 18:46:54 UTC

Re: Learning with sparse vector format data

Hi MADlib User Community,


I am new to MADlib and have a question about data in sparse vector format: can I run learning algorithms on data stored as sparse vectors?

Take logistic regression, for example. The parameters seem to assume that the data is stored in a table.

In my scenario, I have 10 thousand features, so storing them in sparse vector format would be a better solution.



Thanks,

Luyao


Re: Learning with sparse vector format data

Posted by LUYAO CHEN <lu...@hotmail.com>.
Thank you.


________________________________
From: Nikhil Kak <nk...@pivotal.io>
Sent: Friday, July 20, 2018 4:56 PM
To: user@madlib.apache.org
Subject: Re: Learning with sparse vector format data

Hi Luyao,

Thanks for trying out MADlib. Most of the modules, including logistic regression, do not support sparse vector columns. However, kmeans http://madlib.apache.org/docs/latest/group__grp__lda.html does support it.



Let us know if you have more questions.

Thanks,
Nikhil Kak



Re: Learning with sparse vector format data

Posted by Nikhil Kak <nk...@pivotal.io>.
Hi Luyao,

Thanks for trying out MADlib. Most of the modules, including logistic
regression, do not support sparse vector columns. However, kmeans
http://madlib.apache.org/docs/latest/group__grp__lda.html does support it.

Let us know if you have more questions.

Thanks,
Nikhil Kak
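
Note for readers of this thread: the sketch below illustrates what expanding a sparse vector into a dense array looks like, since the reply above says most modules do not accept sparse vector columns. It is written in plain Python and does not call MADlib. It assumes the sparse vector is a run-length-encoded text of the form {counts}:{values}, for example {1,3,1}:{5,0,7}. The function names parse_rle and densify are hypothetical helpers made up for this illustration, not MADlib functions.

```python
from typing import List, Tuple


def parse_rle(text: str) -> Tuple[List[int], List[float]]:
    """Split an assumed '{counts}:{values}' string into two parallel lists."""
    counts_part, values_part = text.split(":")
    counts = [int(c) for c in counts_part.strip().strip("{}").split(",")]
    values = [float(v) for v in values_part.strip().strip("{}").split(",")]
    if len(counts) != len(values):
        raise ValueError("counts and values must have the same length")
    return counts, values


def densify(counts: List[int], values: List[float]) -> List[float]:
    """Expand run-length pairs: each value is repeated 'count' times."""
    dense: List[float] = []
    for count, value in zip(counts, values):
        dense.extend([value] * count)
    return dense


if __name__ == "__main__":
    counts, values = parse_rle("{1,3,1}:{5,0,7}")
    print(densify(counts, values))
```

With 10 thousand features, the dense form stores every zero explicitly, which is the storage cost the original question was trying to avoid. Whether densifying is practical depends on the data size.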
