madlib-user mailing list archives

From Frank McQuillan <fmcquil...@pivotal.io>
Subject Re: Spark related question
Date Mon, 01 Feb 2016 17:21:04 GMT
MADlib and SparkML are both machine learning libraries, but they run on
different engines.

MADlib runs in-database in Greenplum Database, Apache HAWQ and Postgres.

SparkML runs on the Spark execution framework.

Currently, there is no integration between MADlib and SparkML.

Frank

On Mon, Feb 1, 2016 at 9:17 AM, Liang Quan <quanliang@gatech.edu> wrote:

> Thanks for the reply, Gautam,
>
> That's what I suspected as well but would like to confirm. Thank you.
>
> Yes, I was reading some Spark tutorials and the recommended links led me
> to the example in question, hence the title. How closely is MADlib
> associated with Spark then?
>
> Regards,
>
> On Sun, Jan 31, 2016 at 9:19 PM, Gautam Muralidhar <
> gautam.s.muralidhar@gmail.com> wrote:
>
>> Hi Liang,
>>
>> Thank you for your interest in MADlib.
>>
>> Step 4 gives you the per topic word distribution, i.e., the probability
>> of the word 'w' occurring in topic 'k'. Every topic is a distribution over
>> words and this step gives you the distribution for each of the topics.
>>
>> Best,
>> Gautam
>>
>> P.S. The subject line says "Spark related question". I am assuming the
>> subject line was copied from a different thread by mistake.
>>
>> Sent from my iPhone
>>
>> On Jan 31, 2016, at 7:10 PM, Liang Quan <quanliang@gatech.edu> wrote:
>>
>> To whom it may concern,
>>
>> I'm a new subscriber to MADlib. First please allow me to extend my
>> appreciation for what you guys have accomplished. MADlib has a very
>> user-friendly and accessible interface for entry-level users. In addition,
>> I have a question regarding the LDA function example in the link below,
>> http://doc.madlib.net/latest/group__grp__lda.html#examples
>>
>> How is the probability of each word calculated by the LDA function in
>> Step 4 in the table below? Is it the frequency at which the word appears
>> in the document, or something else? Your reply is much appreciated, thanks.
>>
>>  topicid | wordid |        prob        |       word
>> ---------+--------+--------------------+-------------------
>>        1 |     69 |  0.181900726392252 | of
>>        1 |     52 | 0.0608353510895884 | is
>>        1 |     65 | 0.0608353510895884 | models
>>        1 |     30 | 0.0305690072639225 | corpora
>>        1 |      1 | 0.0305690072639225 | 1960s
>>        1 |     57 | 0.0305690072639225 | latent
>>        1 |     35 | 0.0305690072639225 | diverse
>>        1 |     81 | 0.0305690072639225 | semantic
>>        1 |     19 | 0.0305690072639225 | between
>>        1 |     75 | 0.0305690072639225 | pitchers
>>        1 |     43 | 0.0305690072639225 | for
>>        1 |      6 | 0.0305690072639225 | also
>>        1 |     40 | 0.0305690072639225 | favor
>>        1 |     47 | 0.0305690072639225 | had
>>        1 |     28 | 0.0305690072639225 | computational
>>
>>
>> Regards,
>>
>> --
>>
>> Liang Quan, Ph. D.
>>
>> Advanced Write Head Technology, Western Digital Corporation
>>
>> 5601 Great Oaks Parkway
>>
>> San Jose, CA 95119-1003
>>
>> Office: (408)717-7451
>> http://www.linkedin.com/in/liangquan
>>
>>
>>
>>
>
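
On the LDA question quoted above: the probabilities in the Step 4 table can be reproduced with the usual smoothed estimate for a topic's word distribution, P(word | topic) = (n_kw + beta) / (n_k + V * beta), where n_kw is how many times the word is assigned to the topic, n_k is the total number of word assignments in the topic, V is the vocabulary size and beta is the smoothing prior. The sketch below is only a consistency check against the numbers in the table. The values beta = 0.01, n_k = 32 and V = 104 are assumptions inferred from the table itself (only the sum n_k + V * beta = 33.04 is actually pinned down by it), and the function name is hypothetical, not part of MADlib.

```python
# Smoothed per-topic word probability, a standard LDA estimate:
#   P(word | topic) = (n_kw + beta) / (n_k + V * beta)
# ASSUMPTION: BETA, TOPIC_TOTAL and VOCAB_SIZE are inferred from the table
# in the thread; they are not stated anywhere in the messages.

def topic_word_prob(count_in_topic, topic_total, vocab_size, beta):
    """Probability of a word within one topic, with additive smoothing."""
    return (count_in_topic + beta) / (topic_total + vocab_size * beta)

BETA = 0.01        # assumed smoothing prior on the topic-word distribution
TOPIC_TOTAL = 32   # assumed number of word assignments in topic 1
VOCAB_SIZE = 104   # assumed vocabulary size

# Assumed assignment counts for three words from the table (topic 1).
for word, count in [("of", 6), ("is", 2), ("corpora", 1)]:
    prob = topic_word_prob(count, TOPIC_TOTAL, VOCAB_SIZE, BETA)
    print(word, round(prob, 6))
# prints:
# of 0.181901
# is 0.060835
# corpora 0.030569
```

Read this way, the value is not the raw frequency of the word in a document. It is the share of the topic's word assignments that went to that word, with a small prior added so that no word ends up with probability zero.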
