madlib-user mailing list archives

From Frank McQuillan <fmcquil...@pivotal.io>
Subject Re: Regarding kmean error in Madlib
Date Mon, 01 May 2017 18:50:14 GMT
copying user@

Vinit,

The column name has to be passed to MADlib as a string, so it needs outer
single quotes. Because the column name is case sensitive, it also needs
double quotes inside that string, as per PostgreSQL rules.  Try:

SELECT * FROM madlib.kmeanspp('madlib.sample_sordetail', ' "MPrice" ', 2,
                           'madlib.squared_dist_norm2',
                           'madlib.avg', 20, 0.001);
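
To see the case-folding rule on its own, here is a minimal sketch. It assumes
the table and column names from your message and that the column was created
as "MPrice" with double quotes; the LIMIT 1 is only there to keep the output
short.

```sql
-- Unquoted identifiers are folded to lower case, so this looks for a
-- column named mprice and fails if the column was created as "MPrice":
SELECT MPrice FROM madlib.sample_sordetail LIMIT 1;

-- Double quotes preserve the case, so this finds the column:
SELECT "MPrice" FROM madlib.sample_sordetail LIMIT 1;

-- quote_ident() adds the double quotes when an identifier needs them.
-- For 'MPrice' it returns "MPrice", including the double quotes:
SELECT quote_ident('MPrice');
```

The value passed to MADlib is then that double-quoted name wrapped in single
quotes, as in the call above.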

In the future please post questions to the user mailing list
https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201704.mbox/browser
so others can participate too.

Regards,
Frank

On Mon, May 1, 2017 at 3:29 AM, Vinit Mahiwal <vmahiwal@gmail.com> wrote:

> Hi,
>
> I am trying to run MADlib's k-means clustering on retail data. I am using
> Pivotal Greenplum and Aginity to query the data.
>
> *SQL:*
>
> CREATE TABLE km_result AS
> SELECT * FROM madlib.kmeanspp('madlib.sample_sordetail', "MPrice", 2,
>                            'madlib.squared_dist_norm2',
>                            'madlib.avg', 20, 0.001);
>
> The table is sample_sordetail in the schema madlib.
> MADlib version: 1.10
> Error I am getting: ERROR: 42703: column "MPrice" does not exist
>
> Also tried running *using PivotalR, madlib version 1.10*
>
> *fit <- madlib.kmeans(x$MPrice, centers =2,  key = 'ID' , iter.max = 10,
> algorithm = "Loyd" )*
> Executing in database connection 1:
>
> CREATE TABLE __madlib_temp_kmeans__1__ AS SELECT * FROM
> madlib.kmeans_random('select "MPrice" as "MPrice" from
> "madlib"."sample_sordetail"','MPrice',2,
> 'madlib.squared_dist_norm2','madlib.avg',10,0.001)
>
> Error in db.q(sql_i, nrows = -1, conn.id = conn.id, verbose = FALSE) :
>   RS-DBI driver: (could not Retrieve the result : ERROR:  plpy.Error:
> kmeans error: Data table does not exist! (plpython.c:4648)
> CONTEXT:  Traceback (most recent call last):
>   PL/Python function "__kmeans_validate_src", line 23, in <module>
>     return kmeans.kmeans_validate_src(**globals())
>   PL/Python function "__kmeans_validate_src", line 34, in
> kmeans_validate_src
> PL/Python function "__kmeans_validate_src"
> SQL statement "SELECT  madlib.__kmeans_validate_src( $1 )"
> PL/pgSQL function "kmeans_random_seeding" line 14 at perform
> SQL statement "SELECT  madlib.kmeans(  $1 ,  $2 ,
> madlib.kmeans_random_seeding( $1 ,  $2 ,  $3 ),  $4 ,  $5 ,  $6 ,  $7 )"
> PL/pgSQL function "kmeans_random" line 4 at assignment
> )
> In addition: Warning message:
> In .validate.input(x, iter.max, nstart, algorithm) :
>   madlib.kmeans algorithm has to be a Lloydalgorithm is not!
>
> I first tried MADlib 1.9.1 and got the same issue. Can you please help
> debug this?
>
> Thanks & Regards,
> Vinit Mahiwal
