madlib-user mailing list archives

From LUYAO CHEN <luyao_c...@hotmail.com>
Subject Re: Learning with sparse vector format data
Date Mon, 23 Jul 2018 19:25:21 GMT
Thank you.


________________________________
From: Nikhil Kak <nkak@pivotal.io>
Sent: Friday, July 20, 2018 4:56 PM
To: user@madlib.apache.org
Subject: Re: Learning with sparse vector format data

Hi Luyao,

Thanks for trying out MADlib. Most of the modules, including logistic regression, do not support
sparse vector columns. However, kmeans http://madlib.apache.org/docs/latest/group__grp__lda.html
does support it.



Let us know if you have more questions.

Thanks,
Nikhil Kak

On Thu, Jul 19, 2018 at 11:47 AM LUYAO CHEN <luyao_chen@hotmail.com> wrote:


Hi MADlib User Community,


I am new to MADlib. I have a question regarding data in sparse vector format: can I
run the learning on data stored in sparse vector format?

Take logistic regression, for example. It seems the parameters assume that the data is stored in the
table.

In my scenario, I have 10 thousand features, so storing them in sparse vector format
would be a better solution.



Thanks,

Luyao

