madlib-user mailing list archives

From Jingyi Mei <j...@pivotal.io>
Subject [VOTE] MADlib v1.14-rc1
Date Thu, 26 Apr 2018 21:57:16 GMT
Hello Apache MADlib dev community,

This is the vote for Apache MADlib 1.14 Release (RC1). It provides the
source release tarball and convenience binaries. This is the third
Apache MADlib release as an Apache Top Level Project (TLP).

The vote will run for at least 72 working hours and will close on
Tuesday, May 1st, 2018 @ 6pm PDT. A minimum of 3 binding +1 votes and
more binding +1 than binding -1 are required to pass.
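
For anyone tallying, the pass rule stated above can be sketched as a small
check. This is only an illustration of the rule as written in this email;
the function and parameter names are made up for the example and are not
part of any Apache tooling.

```python
def vote_passes(binding_plus: int, binding_minus: int, min_plus: int = 3) -> bool:
    """Apply the rule from this email to the binding votes only.

    Hypothetical helper: passes when there are at least `min_plus`
    binding +1 votes and strictly more binding +1 than binding -1.
    """
    return binding_plus >= min_plus and binding_plus > binding_minus
```

Non-binding votes are deliberately not inputs here, since the rule above
counts binding votes only.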

The main goals of this release are:

New features:

   - New module - Balanced datasets: A sampling module to balance
   classification datasets by resampling using various techniques
   including undersampling, oversampling, uniform sampling or user-defined
   proportion sampling (MADLIB-1168)
   - Mini-batch: Added a mini-batch optimizer for MLP and a preprocessor
   function necessary to create batches from the data (MADLIB-1200,
   MADLIB-1206, MADLIB-1220, MADLIB-1224, MADLIB-1226, MADLIB-1227)
   - k-NN: Added weighted averaging/voting by distance (MADLIB-1181)
   - Summary: Added additional stats: number of positive, negative, zero
   values and 95% confidence intervals for the mean (MADLIB-1167)
   - Encode categorical: Updated to produce lower-case column names when
   possible (MADLIB-1202)
   - MLP: Added support for already one-hot encoded categorical dependent
   variable in a classification task (MADLIB-1222)
   - Pagerank: Added option for personalized vertices, which gives a subset
   of vertices a higher weight, i.e. a higher jump probability than other
   vertices, so that a random surfer is more likely to jump to these
   personalization vertices (MADLIB-1084)
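
To make the Pagerank item above more concrete, here is a minimal sketch of
personalized PageRank in plain Python. It is not the MADlib implementation
and does not use the MADlib API; the function name, the edge-list input and
the choice to send all random jumps uniformly to the personalization
vertices are assumptions made for illustration.

```python
def personalized_pagerank(edges, personalization, damping=0.85, iterations=100):
    """Power iteration where random jumps land only on chosen vertices.

    edges: iterable of (src, dst) pairs (hypothetical input format).
    personalization: vertices that receive the random-jump probability.
    """
    personalization = set(personalization)
    nodes = sorted({v for edge in edges for v in edge} | personalization)
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Jump distribution: uniform over the personalization vertices, zero elsewhere.
    jump = {v: (1.0 / len(personalization) if v in personalization else 0.0)
            for v in nodes}
    rank = dict(jump)

    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a personalization vertex.
        new_rank = {v: (1.0 - damping) * jump[v] for v in nodes}
        for v in nodes:
            if out_links[v]:
                # Otherwise the surfer follows one outgoing edge at random.
                share = damping * rank[v] / len(out_links[v])
                for dst in out_links[v]:
                    new_rank[dst] += share
            else:
                # Dangling vertex: return its mass through the jump distribution.
                for u in nodes:
                    new_rank[u] += damping * rank[v] * jump[u]
        rank = new_rank
    return rank
```

On a three-vertex cycle with vertex 1 as the only personalization vertex,
vertex 1 ends up with the highest rank, which matches the intuition that the
surfer keeps being sent back to it.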

Bug fixes:

   - Fixed issue with invalid calls of construct_array that led to problems
   in PostgreSQL 10 (MADLIB-1185)
   - Added newline between file concatenation during PGXN install
   (MADLIB-1194)
   - Fixed upgrade issues in knn (MADLIB-1197)
   - Added fix to ensure RF variable importance values are always
   non-negative
   - Fixed inconsistency in LDA output and improved usability (MADLIB-1160,
   MADLIB-1201)
   - Fixed MLP and RF predict for models trained in earlier versions to
   ensure missing optional parameters are given appropriate default values
   (MADLIB-1207)
   - Fixed a database crash in DT that occurred when no features remained
   because categorical columns with a single level had been dropped
   - Fixed step size initialization in MLP based on learning rate policy
   (MADLIB-1212)
   - Fixed PCA issue that led to failure when the grouping column is of
   TEXT type (MADLIB-1215)
   - Fixed cat levels output in DT when grouping is enabled (MADLIB-1218)
   - Fixed and simplified initialization of model coefficients in MLP
   - Removed source table dependency for predicting regression models in
   MLP (MADLIB-1223)
   - Print loss of first iteration in MLP (MADLIB-1228)
   - Fixed MLP failure on GPDB 4.3 when verbose=True (MADLIB-1209)
   - Fixed RF issue that showed up when var_importance=True with no
   continuous features (MADLIB-1219)
   - Fixed DT/RF issue for null_as_category=True and grouping enabled
   (MADLIB-1217)

Other:

   - Reduced install-check runtime for PCA, DT, RF, elastic net
   (MADLIB-1216)
   - Added CentOS 7 PostgreSQL 9.6/10 docker files

For additional information, please see:
https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14

Here are the release artifact details:

Source release tag to be voted on: rc/1.14-rc1, located here:
https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.14-rc1

Source release tarball can be retrieved from the following locations:

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.sha512

Convenience binary packages can be retrieved from the following
locations:

macOS: 10.* PostgreSQL 9.6 & 10.2

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.sha512

CentOS* GPDB 4.3.5+

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.sha512

CentOS 6 &* GPDB 5.3.0, PostgreSQL 9.6 & 10.2

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.sha512

The PGP KEYS file used to validate the signature of the release artifacts
is available here:
https://dist.apache.org/repos/dist/dev/madlib/KEYS
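
For voters checking the SHA512 hashes listed above, a minimal sketch using
only the Python standard library is shown here. It assumes you have already
downloaded the artifact and copied the hex digest out of the matching
.sha512 file by hand, since the layout of that file is not described in this
email. The helper names are hypothetical. The PGP signature still needs to
be verified separately against the KEYS file.

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, published_hex):
    """Compare a file's digest with a published hex string.

    Whitespace and letter case in the published value are ignored, because
    some tools print digests in upper case or in space-separated groups.
    """
    normalized = "".join(published_hex.split()).lower()
    return sha512_hex(path) == normalized
```

A call such as digest_matches("apache-madlib-1.14-src.tar.gz", hex_from_file)
returns True only when the local file hashes to the published value.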

To help in tallying the vote, PMC members, please be sure to indicate
“(binding)” with your vote.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Regards,
Jingyi Mei

Pivotal R&D Advanced Analytics
​
