madlib-user mailing list archives

From Frank McQuillan <fmcquil...@pivotal.io>
Subject Re: [VOTE] MADlib v1.17.0-rc2
Date Wed, 08 Apr 2020 22:21:46 GMT
+1 (binding)

tested on pg11.3 on osx
- passed install check, dev check, spot check of some new 1.17 functions

tested on gp5.18
- passed spot check of some new 1.17 functions

well done!

On Mon, Apr 6, 2020 at 4:49 PM Orhan Kislal <okislal@pivotal.io> wrote:

> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
> source release tarball and convenience binaries.
>
> We didn't hold a vote for RC1 because we discovered a minor issue before
> sending the vote.
>
> The vote will run for at least 72 hours and will close on Thursday,
> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
> and more binding +1 than binding -1 are required to pass.
>
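The passing rule quoted above (at least 3 binding +1 votes, and more binding +1 than binding -1) can be written down as a small check. This is only an illustrative sketch: the function name `vote_passes` and its parameters are made up here and are not part of any Apache or MADlib tooling.

```python
def vote_passes(binding_plus_one: int,
                binding_minus_one: int,
                minimum_plus_one: int = 3) -> bool:
    """Return True if the vote meets the rule stated in the announcement.

    Hypothetical helper: only binding votes are counted, and the vote
    passes when there are at least `minimum_plus_one` binding +1 votes
    and strictly more binding +1 votes than binding -1 votes.
    """
    enough_approvals = binding_plus_one >= minimum_plus_one
    more_for_than_against = binding_plus_one > binding_minus_one
    return enough_approvals and more_for_than_against
```

For example, `vote_passes(3, 0)` is true, while `vote_passes(2, 0)` and `vote_passes(3, 3)` are both false.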
> The main goals of this release are:
>
> New features:
>     - DL: Add optional params to madlib_keras_fit_multiple_model
> (MADLIB-1397)
>     - DL: Fit and evaluate changes for asymmetric cluster config
> (MADLIB-1393)
>     - DL: Make param search fit() function work with existing evaluate and
> predict (MADLIB-1387)
>     - DL: ParamSearch: Add utility function for generating model selection
> table (MADLIB-1375)
>     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
>     - DL: Preprocessor should evenly distribute data on an arbitrary number
> of segments (MADLIB-1378)
>     - DL: Preprocessor support for asymmetric segment distribution
> (MADLIB-1392)
>     - DL: Remove model_arch_table column from the output of
> load_model_selection_table (MADLIB-1381)
>     - DL: Support DL predict without training on MADlib (MADLIB-1359)
>     - DL: Transfer learning for multi-model (MADLIB-1389)
>     - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
>     - Kmeans: Select number of centroids in k-means (MADLIB-1380)
>     - PostgreSQL 12 support (MADLIB-1391)
>
> Improvements:
>     - Assoc rules: Add option to set number of posterior in association
> rules (MADLIB-1327)
>     - Correlation: Improve correlation and covariance memory usage with
> large number of groups (MADLIB-1301)
>     - DL: helper function for asymmetric cluster config (MADLIB-1390)
>     - DL: Mini-batch preprocessor for images - performance issue
> (MADLIB-1342)
>     - DL: Modify warm start logic for DL to handle case of missing weight
> (MADLIB-1400)
>     - DL: Param search for multiple models on MPP architecture
> (MADLIB-1386)
>     - DL: performance improvements to fit transition function (MADLIB-1418)
>     - Docs: Enhance Installation Guides (MADLIB-1399)
>     - Graph: SSSP should not show vertices in output table that are
> unreachable (MADLIB-1415)
>     - Knn - add zero check and output distance array (MADLIB-1370)
>     - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
>     - Summary: Last optional param in summary errors when NULL
> (MADLIB-1413)
>     - Summary: Summary function has dups for MFV for approximate results
> (MADLIB-1412)
>     - SVM: Change default num_components for SVM to max(100,
> 2*num_features) (MADLIB-1384)
>
> Bug fixes:
>     - DL: Deep Learning module does not work with tables in non-public
> schemas (MADLIB-1388)
>     - DL: Exception during madlib_keras_fit when model_arch_id is passed as
> NULL (MADLIB-1371)
>     - DL: fit and fit multiple fail with memory exception in gpdb6
> (MADLIB-1405)
>     - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
>     - DL: Intermediate tables are not dropped (MADLIB-1404)
>     - DL: MADlib Keras operations create too many threads (MADLIB-1372)
>     - DL: metrics_elapsed_time for fit multi_model not captured correctly
> (MADLIB-1403)
>     - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
>     - DL: Remove final function for fit multiple (MADLIB-1416)
>     - DL: Support schema qualified output tables for fit and fit_multiple
> (MADLIB-1417)
>     - Graph: APSP fails if both vertex id column and edge src column have
> the same name (MADLIB-1407)
>     - Graph: APSP Path Function fails if src or dest column type is bigint
> (MADLIB-1408)
>     - Graph: Graph/wcc fails if the user specifies a schema for the output
> table (MADLIB-1411)
>     - Kmeans: k-means related functions must use same default distance
> function (MADLIB-1383)
>     - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
>     - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
>     - Pivot: Pivot documentation should say "out_table" instead of
> "output_table" (MADLIB-1376)
>
> Other:
>     - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
>     - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
> some versions of GPDB 6, the database will keep adding to the disk space
> (in proportion to model size) and will only release the disk space once the
> fit multiple query has completed execution. This is not the case for GPDB
> 6.5.0+ where disk space is released during the fit multiple query.
>     - DL: CUDA GPU memory cannot be released until the process holding it
> is terminated. This process holds the GPU memory until one of the
> following two things happens: the query finishes and the user logs out of
> the Postgres client/session; or the query finishes and the user waits for
> the timeout set by `gp_vmem_idle_resource_timeout`. The default value for
> this timeout in Greenplum is 18 sec, but it can be changed.
>     - DL: pg_temp is not allowed as an output table schema for
> madlib_keras_fit_multiple_model().
>     - Build: Enable current versions of bison
>     - Build: Add cmake variable for gppkg filename
>     - Build: Add pull request template
>
> 1.17.0 docs available here:
> http://madlib.apache.org/docs/rc/index.html
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.17.0-rc2, located here:
>
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2
>
> Source release tarball can be retrieved from the following locations:
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512
>
> CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
> (compiled with gcc 6.2)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512
>
> Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512
>
> The PGP KEYS file used to validate the signature of the release artifacts
> is available here:
> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>
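For anyone checking the artifacts listed above before voting, here is a minimal sketch of comparing a downloaded package against its published `.sha512` file. It assumes both files have already been downloaded locally, and it assumes the `.sha512` file contains the hex digest somewhere in its text (possibly split by spaces or line breaks, possibly upper case); the exact layout of that file is not stated in the announcement. The helper names are hypothetical. It does not verify the PGP signature, which needs a separate tool such as gpg together with the KEYS file.

```python
import hashlib
from pathlib import Path


def sha512_hex(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(artifact_path, sha512_file_path):
    """Check whether the artifact's digest appears in the published hash file.

    Assumption: the hash file holds the hex digest, optionally broken up by
    whitespace and in either case, so all whitespace is removed and the text
    is lowercased before comparing.
    """
    published = Path(sha512_file_path).read_text()
    normalized = "".join(published.split()).lower()
    return sha512_hex(artifact_path) in normalized
```

Usage would look like `matches_published_hash("apache-madlib-1.17.0-src.tar.gz", "apache-madlib-1.17.0-src.tar.gz.sha512")`, run from the directory holding both downloads.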
> To help in tallying the vote, PMC members please be sure to indicate
> “(binding)” with the vote.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Best regards,
> Orhan Kislal <okislal@apache.org>
>
