madlib-user mailing list archives

From Satoshi Nagayasu <sn...@uptime.jp>
Subject Re: Multinomial Regression: Failed with a msg "Hessian or gradient is not finite."
Date Sun, 09 Apr 2017 06:47:31 GMT
Neki-san,

>   > 1) [NG] MADlib 1.10(Compiling From Source) + PG 9.6 (+ Ubuntu 16.04)

I'm not familiar with Ubuntu, but I now suspect that the PostgreSQL binary
you used here may have some issues, because a SEGFAULT often happens
in such a situation.

Could you build the PostgreSQL binary from the source code yourself
and try again?
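
Before rebuilding, it may help to confirm which backend process crashed and with which signal. The following is a minimal Python sketch, not part of PostgreSQL or MADlib, that scans server log lines in the format quoted below for abnormal terminations. The function name `find_crashes` and the regular expression are assumptions derived from that one log line; other log formats may need a different pattern.

```python
import re

# Pattern for the postmaster message seen in the quoted log, e.g.
# "LOG:  server process (PID 12483) was terminated by signal 11: Segmentation fault"
# (assumption: the optional text after the colon names the signal).
CRASH_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal "
    r"(?P<signal>\d+)(?:: (?P<name>.+))?"
)


def find_crashes(lines):
    """Return a (pid, signal, description) tuple for every crash line found."""
    crashes = []
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            crashes.append(
                (int(match["pid"]), int(match["signal"]), (match["name"] or "").strip())
            )
    return crashes


if __name__ == "__main__":
    sample = [
        "2017-04-06 07:53:03 UTC [12162-4] LOG:  server process (PID 12483) "
        "was terminated by signal 11: Segmentation fault",
        "2017-04-06 07:53:03 UTC [12162-6] LOG:  terminating any other active "
        "server processes",
    ]
    for pid, signal, name in find_crashes(sample):
        print(f"PID {pid} died with signal {signal} ({name})")
```

To scan a real log, the lines of an opened log file can be passed to `find_crashes` instead of `sample`.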

Regards,

2017-04-06 17:28 GMT+09:00 Neki, Atsushi <neki.atsushi@jp.fujitsu.com>:
> Nagayasu-san,
>
>
>
>> Could you find something in your PostgreSQL server log?
>
> I attached log files that were collected in the situation below.
>
>   > 1) [NG] MADlib 1.10(Compiling From Source) + PG 9.6 (+ Ubuntu 16.04)
>   >     : Failed. Same result as before.
>   >       As additional information,
>   >       madlib install-check also failed in this environment.
>   >       It looks similar to the issue below.
>   >        https://issues.apache.org/jira/browse/MADLIB-1068
>
>   log_install-check_fail.txt:
>     Log when madlib install check failed at check_elastic_net.
>
>   log_multinom_fail.txt:
>     Log when madlib.multinom failed.
>
>   In both of them, I can see the following message.
>
>   > 2017-04-06 07:53:03 UTC [12162-4] LOG:  server process (PID 12483) was terminated by signal 11: Segmentation fault
>   > 2017-04-06 07:53:03 UTC [12162-6] LOG:  terminating any other active server processes
>   > 2017-04-06 07:53:03 UTC [12168-2] WARNING:  terminating connection because of crash of another server process
>   > 2017-04-06 07:53:03 UTC [12168-3] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>   > 2017-04-06 07:53:03 UTC [12168-4] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>
>  Following the HINT in the log message,
>  I confirmed that the other servers had already been stopped.
>  I then reconnected, but the situation did not improve.
>
>   $ pg_lsclusters
>   Ver Cluster Port Status Owner    Data directory               Log file
>   9.4 main    5434 down   postgres /var/lib/postgresql/9.4/main /var/log/postgresql/postgresql-9.4-main.log
>   9.5 main    5432 down   postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-9.5-main.log
>   9.6 main    5433 online postgres /var/lib/postgresql/9.6/main pg_log/postgresql-%Y-%m-%d_%H%M%S.log
>
>
>> And how did you install your PostgreSQL 9.6 on Ubuntu?
>> Built from the source by yourself, or used some packages?
>
> I used the packages provided by the PostgreSQL Apt Repository.
>  http://askubuntu.com/questions/831292/how-to-install-postgresql-9-6-on-any-ubuntu-version
>
>
> Thank you,
> Atsushi
>
> -----Original Message-----
> From: Satoshi Nagayasu [mailto:snaga@uptime.jp]
> Sent: Thursday, April 6, 2017 2:57 PM
> To: user@madlib.incubator.apache.org
> Subject: Re: Multinomial Regression: Failed with a msg "Hessian or gradient is not finite."
>
> Hi Neki-san,
>
> I tried to reproduce the situation with PG 9.6 and MADlib 1.10.0 on CentOS 6.
> Unfortunately, I couldn't reproduce it (see below); it appeared to work fine on my box.
>
> Could you find something in your PostgreSQL server log?
>
> And how did you install your PostgreSQL 9.6 on Ubuntu?
> Built from the source by yourself, or used some packages?
>
> Regards,
>
> ------------------------------------
> testdb=# select version();
>                                                  version
> ----------------------------------------------------------------------------------------------------------
>  PostgreSQL 9.6.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit
> (1 row)
>
> testdb=# select madlib.version();
>                                         version
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
>  MADlib version: 1.10.0, git revision: unknown, cmake configuration time: Thu Apr  6 05:36:25 UTC 2017, build type: RelWithDebInfo, build system: Linux-2.6.32-504.el6.x86_64, C compiler: gcc 4.4.7, C++ compiler: g++ 4.4.7
> (1 row)
>
> testdb=# create table mactbl_mini(
> testdb(#   ap1 integer,
> testdb(#   ap2 integer,
> testdb(#   ap3 integer,
> testdb(#   floor integer,
> testdb(#   id integer
> testdb(# );
> CREATE TABLE
> testdb=# INSERT INTO mactbl_mini VALUES
> testdb-#   (-90,-86,0,601,1),
> testdb-#   (-84,0,0,601,2),
> testdb-#   (-83,0,0,601,3),
> testdb-#   (0,-72,-84,601,6),
> testdb-#   (0,0,-89,602,7),
> testdb-#   (0,0,0,602,8),
> testdb-#   (0,-85,0,603,43);
> INSERT 0 7
> testdb=# SELECT madlib.multinom('mactbl_mini',
> testdb(#                        'mactbl_output',
> testdb(#                        'floor',
> testdb(#                        'ARRAY[1,
> testdb'#                        ap1,
> testdb'#                        ap2,
> testdb'#                        ap3]',
> testdb(#                        '601',
> testdb(#                        'logit');
> WARNING:  Hessian or gradient is not finite.
> [... the same WARNING line repeats; 60 occurrences in total ...]
>  multinom
> ----------
>
> (1 row)
>
> testdb=# \x
> Expanded display is on.
> testdb=# select * from mactbl_output;
> -[ RECORD 1 ]------+----------------------------------------------------------------------
> category           | 602
> coef               | {98.501202727275,1.66424014822047,1.23992159121658,0.719054608268981}
> log_likelihood     | NaN
> std_err            | {NaN,NaN,NaN,NaN}
> z_stats            | {NaN,NaN,NaN,NaN}
> p_values           | {NaN,NaN,NaN,NaN}
> num_rows_processed | 7
> num_rows_skipped   | 0
> num_iterations     | 100
> -[ RECORD 2 ]------+----------------------------------------------------------------------
> category           | 603
> coef               | {6715539712038.1,-52386230744.054,-483412078248451,-563980757956539}
> log_likelihood     | NaN
> std_err            | {NaN,NaN,NaN,NaN}
> z_stats            | {NaN,NaN,NaN,NaN}
> p_values           | {NaN,NaN,NaN,NaN}
> num_rows_processed | 7
> num_rows_skipped   | 0
> num_iterations     | 100
>
> testdb=#
> ------------------------------------
>
> 2017-04-06 13:50 GMT+09:00 Neki, Atsushi <neki.atsushi@jp.fujitsu.com>:
>> Frank,
>>
>> I tried again the following environments:
>>
>> 1) [NG] MADlib 1.10(Compiling From Source) + PG 9.6 (+ Ubuntu 16.04)
>>     : Failed. Same result as before.
>>       As additional information,
>>       madlib install-check also failed in this environment.
>>       It looks similar to the issue below.
>>        https://issues.apache.org/jira/browse/MADLIB-1068
>>
>> 2) [OK] MADlib 1.9.1(Binary) + PG 9.5 (+ Ubuntu 16.04)
>>     : Succeeded.
>>       The msg 'WARNING:  Hessian or gradient is not finite.' still remains.
>>
>> 3) [NT] MADlib 1.10(Compiling From Source)  + PG 9.4 (+ Ubuntu 16.04)
>>     : MADlib build error (I will not report more about it here for now)
>>     > [ 48%] Building CXX object src/ports/postgres/9.6/CMakeFiles/madlib_postgresql_9_6.dir/__/__/__/modules/linalg/matrix_decomp.cpp.o
>>     > g++: internal compiler error: Killed (program cc1plus)
>>
>> 4) [OK] MADlib 1.10(Compiling From Source)  + PG 9.6 (+ CentOS 7)
>>     : Succeeded.
>>       The msg 'WARNING:  Hessian or gradient is not finite.' still remains.
>>
>>
>> I will move my workspace to the 4th environment (1.10 + PG9.6 + CentOS).
>> Thank you very much for your quick response and advice.
>>
>>
>> Does the msg 'WARNING:  Hessian or gradient is not finite.' mean that something is wrong?
>> Should I be concerned about it?
>> When I see the message, some NaN values are included in the output.
>> Even so, the model can predict the category well.
>>
>>
>>   # select * from mactbl_output;
>>   -[ RECORD 1 ]------+-----------------------------------------------------------------------
>>   category           | 602
>>   coef               | {104.464636547264,1.72398814787594,1.29895329386266,0.688758979636415}
>>   log_likelihood     | NaN
>>   std_err            | {NaN,NaN,NaN,NaN}
>>   z_stats            | {NaN,NaN,NaN,NaN}
>>   p_values           | {NaN,NaN,NaN,NaN}
>>   num_rows_processed | 7
>>   num_rows_skipped   | 0
>>   num_iterations     | 100
>>   -[ RECORD 2 ]------+-----------------------------------------------------------------------
>>   category           | 603
>>   coef               | {-572252710812.094,4826990013.37584,41195201231745.7,48061068103698.5}
>>   log_likelihood     | NaN
>>   std_err            | {NaN,NaN,NaN,NaN}
>>   z_stats            | {NaN,NaN,NaN,NaN}
>>   p_values           | {NaN,NaN,NaN,NaN}
>>   num_rows_processed | 7
>>   num_rows_skipped   | 0
>>   num_iterations     | 100
>>
>>
>>
>> Thank you,
>> Atsushi
>>
>>
>>
>> From: Frank McQuillan [mailto:fmcquillan@pivotal.io]
>> Sent: Thursday, April 6, 2017 2:21 AM
>> To: user@madlib.incubator.apache.org
>> Subject: Re: Multinomial Regression: Failed with a msg "Hessian or gradient is not finite."
>>
>> Atsushi-san,
>>
>> The error that you see makes me think that you lost connection to the database in the middle of the query:
>>
>> "server closed the connection unexpectedly
>>         This probably means the server terminated abnormally
>>         before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed."
>>
>> I ran the same query with MADlib 1.10 on PG 9.4 and it seemed to work fine for me:
>>
>> madlib=# SELECT * FROM mactbl_mini;
>>  ap1 | ap2 | ap3 | floor | id
>> -----+-----+-----+-------+----
>>  -90 | -86 |   0 |   601 |  1
>>  -84 |   0 |   0 |   601 |  2
>>  -83 |   0 |   0 |   601 |  3
>>    0 | -72 | -84 |   601 |  6
>>    0 |   0 | -89 |   602 |  7
>>    0 |   0 |   0 |   602 |  8
>>    0 | -85 |   0 |   603 | 43
>> (7 rows)
>>
>>
>> madlib=# SELECT madlib.multinom('mactbl_mini',
>> madlib(#                      'mactbl_output',
>> madlib(#                       'floor',
>> madlib(#                       'ARRAY[1,
>> madlib'#                        ap1,
>> madlib'#                       ap2,
>> madlib'#                        ap3]',
>> madlib(#                        '601',
>> madlib(#                     'logit');
>>  multinom
>> ----------
>>
>> (1 row)
>>
>>
>> madlib=# \x on
>> Expanded display is on.
>> madlib=# SELECT * FROM mactbl_output;
>> -[ RECORD 1 ]------+--------------------------------------------------------------------------------------
>> category           | 602
>> coef               | {112.327782768804,2.47679877229691,1.69069062458081,0.841889190837338}
>> log_likelihood     | NaN
>> std_err            | {5.81382676773028e+16,8.77427991785089e+17,1.06802178555571e+15,1.67358709762743e+15}
>> z_stats            | {1.93207997514273e-15,2.82279434379335e-18,1.58301136497989e-15,5.0304474265537e-16}
>> p_values           | {0.999999999999998,1,0.999999999999999,1}
>> num_rows_processed | 7
>> num_rows_skipped   | 0
>> num_iterations     | 100
>> -[ RECORD 2 ]------+--------------------------------------------------------------------------------------
>> category           | 603
>> coef               | {32.633774536969,1.61185753977669,-0.177075877885652,1.8769521118073}
>> log_likelihood     | NaN
>> std_err            | {9.40692076915866e+16,5.45753786923624e+18,1.30468621462021e+15,1.1421554354824e+22}
>> z_stats            | {3.46912399262056e-16,2.95345186491992e-19,-1.35722962273498e-16,1.6433420999433e-22}
>> p_values           | {1,1,1,1}
>> num_rows_processed | 7
>> num_rows_skipped   | 0
>> num_iterations     | 100
>>
>>
>>
>> So please try again.
>>
>> Frank
>>
>> On Tue, Apr 4, 2017 at 7:28 PM, Neki, Atsushi <neki.atsushi@jp.fujitsu.com> wrote:
>> Hello,
>>
>>
>> I ran into a computation failure when using the madlib.multinom function.
>> Is there anything wrong with my procedure?
>>
>> ====================================
>>
>> - Issue description
>>
>> madlib.multinom failed with the following message:
>>
>>   > WARNING:  Hessian or gradient is not finite.
>>   > server closed the connection unexpectedly
>>   >     This probably means the server terminated abnormally
>>   >     before or while processing the request.
>>   > The connection to the server was lost. Attempting reset: Failed.
>>
>>
>> - Library
>>
>> This issue happens in the following environments:
>>  1. madlib 1.9.1 + PostgreSQL 9.5
>>  2. madlib 1.10  + PostgreSQL 9.6
>>
>>
>> - Expectation
>>
>> The result should be:
>>
>>  1. The model is computed successfully.
>>  2. The resulting model predicts the output accurately
>>     (because this is quite a simple use case).
>>
>>
>> - Procedure to reproduce this issue
>>
>> 1. Prepare the following table:
>>
>> testdb=# select * from mactbl_mini;
>>  ap1 | ap2 | ap3 | floor | id
>> -----+-----+-----+-------+----
>>  -90 | -86 |   0 | 601   |  1
>>  -84 |   0 |   0 | 601   |  2
>>  -83 |   0 |   0 | 601   |  3
>>    0 | -72 | -84 | 601   |  6
>>    0 |   0 | -89 | 602   |  7
>>    0 |   0 |   0 | 602   |  8
>>    0 | -85 |   0 | 603   | 43
>> (7 rows)
>>
>>
>> 2. Compute a model to predict floor with ap by using madlib.multinom
>>
>> testdb=# drop table mactbl_output;
>> DROP TABLE
>> testdb=# drop table mactbl_output_summary;
>> DROP TABLE
>> testdb=#
>> testdb=# SELECT madlib.multinom('mactbl_mini',
>> testdb(#                       'mactbl_output',
>> testdb(#                       'floor',
>> testdb(#                       'ARRAY[1,
>> testdb'#                       ap1,
>> testdb'#                       ap2,
>> testdb'#                       ap3]',
>> testdb(#                       '601',
>> testdb(#                       'logit');
>> WARNING:  Hessian or gradient is not finite.
>> server closed the connection unexpectedly
>>         This probably means the server terminated abnormally
>>         before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>> !>
>>
>>
>> When I changed some parameters, it succeeded.
>>
>> - Set ref_category to '602' or '603'.   [Success]
>> - Change max_iter from 100(default) to 10.       [Success]
>>
>> ====================================
>>
>>
>> Thanks in advance.
>>
>> --
>>  Atsushi Neki
>>
>
>
>
> --
> Satoshi Nagayasu <snaga@uptime.jp>



-- 
Satoshi Nagayasu <snaga@uptime.jp>
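
On the NaN values in the outputs quoted above: coefficients of the order of 1e14 make the linear predictor far too large for a double-precision exp(). The following is a minimal Python sketch, not MADlib's actual implementation, that plugs the coefficients from the CentOS 6 run into a naively computed softmax for the row (0, -85, 0) and compares it with the usual max-subtraction form. The helper names `c_exp`, `linear_predictors`, `naive_probs` and `stable_probs` are hypothetical, and `c_exp` only imitates the C behaviour of returning inf on overflow.

```python
import math

# Coefficients reported in the thread (reference category 601),
# ordered as (intercept, ap1, ap2, ap3).
COEF = {
    602: (98.501202727275, 1.66424014822047, 1.23992159121658, 0.719054608268981),
    603: (6715539712038.1, -52386230744.054, -483412078248451.0, -563980757956539.0),
}


def c_exp(x):
    """exp() that returns inf on overflow, as C's exp() does, instead of raising."""
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf


def linear_predictors(ap1, ap2, ap3):
    """Linear predictor per category; the reference category is fixed at 0."""
    x = (1.0, ap1, ap2, ap3)
    eta = {601: 0.0}
    for cat, beta in COEF.items():
        eta[cat] = sum(b * v for b, v in zip(beta, x))
    return eta


def naive_probs(ap1, ap2, ap3):
    """Softmax without any overflow protection."""
    expo = {cat: c_exp(e) for cat, e in linear_predictors(ap1, ap2, ap3).items()}
    total = sum(expo.values())
    return {cat: e / total for cat, e in expo.items()}


def stable_probs(ap1, ap2, ap3):
    """Softmax with the largest predictor subtracted before exponentiating."""
    eta = linear_predictors(ap1, ap2, ap3)
    top = max(eta.values())
    expo = {cat: math.exp(e - top) for cat, e in eta.items()}
    total = sum(expo.values())
    return {cat: e / total for cat, e in expo.items()}


if __name__ == "__main__":
    row = (0, -85, 0)  # the only row with floor = 603
    print("naive :", naive_probs(*row))
    print("stable:", stable_probs(*row))
```

With these coefficients the naive form gives inf/inf, that is NaN, for category 603, while the stabilised form gives a finite probability. Whether MADlib's NaN values come from this exact mechanism is an assumption; the sketch only shows that coefficients this large are enough to produce non-finite intermediate values.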
