madlib-user mailing list archives

From Gautam Muralidhar <gautam.s.muralid...@gmail.com>
Subject Re: Spark related question
Date Mon, 01 Feb 2016 05:19:21 GMT
Hi Liang,

Thank you for your interest in MADlib.

Step 4 gives you the per-topic word distribution, i.e., the probability of the word 'w' occurring
in topic 'k'. Every topic is a distribution over the vocabulary, so within a topic the probabilities
of all words sum to 1, and this step returns that distribution for each of the topics.
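
As a rough illustration of how such a per-topic word distribution is commonly estimated in
Gibbs-sampling LDA, here is a minimal Python sketch. It assumes the usual smoothed estimate
P(w | k) = (n_kw + beta) / (n_k + V * beta), where n_kw is the number of times word w is assigned
to topic k, n_k is the total number of words assigned to topic k, V is the vocabulary size, and
beta is the Dirichlet prior on words. The function name and the input layout are hypothetical,
and this is not taken from the MADlib source.

```python
from collections import defaultdict


def topic_word_probs(assignments, vocab_size, beta):
    """Estimate P(word | topic) from (topic_id, word_id) assignment pairs.

    Hypothetical helper: uses the standard smoothed estimate
    (n_kw + beta) / (n_k + vocab_size * beta). Whether MADlib computes
    its Step 4 output in exactly this way is an assumption.
    """
    counts = defaultdict(lambda: defaultdict(int))  # counts[k][w] = n_kw
    totals = defaultdict(int)                       # totals[k]    = n_k
    for topic_id, word_id in assignments:
        counts[topic_id][word_id] += 1
        totals[topic_id] += 1

    probs = {}
    for k, n_k in totals.items():
        denom = n_k + vocab_size * beta
        # One probability per word id in the vocabulary, including unseen words.
        probs[k] = [(counts[k].get(w, 0) + beta) / denom for w in range(vocab_size)]
    return probs


# Toy data: three tokens assigned to topic 0, one token assigned to topic 1.
toy_assignments = [(0, 0), (0, 0), (0, 1), (1, 2)]
probs = topic_word_probs(toy_assignments, vocab_size=3, beta=0.5)
```

Under this estimate the probability is driven by how often a word is assigned to the topic across
the whole corpus, not by its frequency in any single document. The values in the quoted table look
consistent with this form: 1.01/33.04, 2.01/33.04 and 6.01/33.04 reproduce 0.0305690..., 0.0608353...
and 0.1819007..., which would correspond to beta = 0.01 and counts of 1, 2 and 6, though that
reading is inferred from the numbers alone.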

Best,
Gautam 

P.S. The subject line says "Spark related question". I assume it was copied from a different
thread by mistake.

Sent from my iPhone

> On Jan 31, 2016, at 7:10 PM, Liang Quan <quanliang@gatech.edu> wrote:
> 
> To whom this may concern, 
> 
> I'm a new subscriber of MADlib. First, please allow me to extend my appreciation for what
> you guys have accomplished. MADlib has a very user-friendly and accessible interface for
> entry-level users. In addition, I have a question regarding the LDA function example in the
> link below:
> http://doc.madlib.net/latest/group__grp__lda.html#examples
> 
> How is the probability of each word calculated by the LDA function in Step 4 in the
> table below? Is it the frequency at which the word appears in the document, or something else?
> Your reply is much appreciated, thanks.
> 
>  topicid | wordid |        prob        |       word
> ---------+--------+--------------------+-------------------
>        1 |     69 |  0.181900726392252 | of
>        1 |     52 | 0.0608353510895884 | is
>        1 |     65 | 0.0608353510895884 | models
>        1 |     30 | 0.0305690072639225 | corpora
>        1 |      1 | 0.0305690072639225 | 1960s
>        1 |     57 | 0.0305690072639225 | latent
>        1 |     35 | 0.0305690072639225 | diverse
>        1 |     81 | 0.0305690072639225 | semantic
>        1 |     19 | 0.0305690072639225 | between
>        1 |     75 | 0.0305690072639225 | pitchers
>        1 |     43 | 0.0305690072639225 | for
>        1 |      6 | 0.0305690072639225 | also
>        1 |     40 | 0.0305690072639225 | favor
>        1 |     47 | 0.0305690072639225 | had
>        1 |     28 | 0.0305690072639225 | computational
> 
> Regards, 
> 
> -- 
> Liang Quan, Ph. D.
> 
> Advanced Write Head Technology, Western Digital Corporation
> 
> 5601 Great Oaks Parkway
> 
> San Jose, CA 95119-1003
> 
> Office: (408)717-7451
> 
> http://www.linkedin.com/in/liangquan
> 
> 
> 
