madlib-user mailing list archives

From Liang Quan <quanli...@gatech.edu>
Subject Spark related question
Date Mon, 01 Feb 2016 03:10:20 GMT
To whom it may concern,

I'm a new subscriber to MADlib. First, please allow me to extend my
appreciation for what you guys have accomplished. MADlib has a very
user-friendly and accessible interface for entry-level users. I also
have a question regarding the LDA function example at the link below:
http://doc.madlib.net/latest/group__grp__lda.html#examples

How is the probability of each word calculated by the LDA function in
Step 4, shown in the table below? Is it the frequency at which the word
appears in the document, or something else? Your reply is much
appreciated. Thanks.

 topicid | wordid |        prob        |       word
---------+--------+--------------------+-------------------
       1 |     69 |  0.181900726392252 | of
       1 |     52 | 0.0608353510895884 | is
       1 |     65 | 0.0608353510895884 | models
       1 |     30 | 0.0305690072639225 | corpora
       1 |      1 | 0.0305690072639225 | 1960s
       1 |     57 | 0.0305690072639225 | latent
       1 |     35 | 0.0305690072639225 | diverse
       1 |     81 | 0.0305690072639225 | semantic
       1 |     19 | 0.0305690072639225 | between
       1 |     75 | 0.0305690072639225 | pitchers
       1 |     43 | 0.0305690072639225 | for
       1 |      6 | 0.0305690072639225 | also
       1 |     40 | 0.0305690072639225 | favor
       1 |     47 | 0.0305690072639225 | had
       1 |     28 | 0.0305690072639225 | computational
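
One reading that is consistent with the numbers above is a smoothed
count ratio, prob = (count + beta) / (topic_total + vocab_size * beta),
where count is how many times the word is assigned to the topic. The
sketch below only checks that arithmetic against three rows of the table.
The values beta = 0.01, topic_total = 32 and vocab_size = 104, and the
per-word counts, are assumptions inferred from the probabilities
themselves, not taken from the MADlib source or documentation.

```python
def smoothed_word_prob(count, topic_total, vocab_size, beta):
    """Smoothed estimate of P(word | topic) from topic-assignment counts."""
    return (count + beta) / (topic_total + vocab_size * beta)


# Assumed values, inferred from the table (not confirmed by MADlib docs).
BETA = 0.01        # assumed smoothing prior on words
TOPIC_TOTAL = 32   # assumed number of word tokens assigned to topic 1
VOCAB_SIZE = 104   # assumed vocabulary size

# Probabilities copied from the table, with assumed per-word counts.
rows = [
    ("of", 6, 0.181900726392252),
    ("is", 2, 0.0608353510895884),
    ("corpora", 1, 0.0305690072639225),
]

for word, assumed_count, table_prob in rows:
    estimate = smoothed_word_prob(assumed_count, TOPIC_TOTAL, VOCAB_SIZE, BETA)
    print(f"{word}: estimate={estimate:.15f} table={table_prob:.15f}")
```

If this reading is right, the probability reflects how often a word is
assigned to the topic across the whole corpus, plus a small prior, rather
than its raw frequency in a single document.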


Regards,

-- 

Liang Quan, Ph. D.

Advanced Write Head Technology, Western Digital Corporation

5601 Great Oaks Parkway

San Jose, CA 95119-1003

Office: (408)717-7451
http://www.linkedin.com/in/liangquan
