madlib-user mailing list archives

From Orhan Kislal <okis...@pivotal.io>
Subject Re: [VOTE] MADlib v1.17.0-rc2
Date Thu, 09 Apr 2020 22:43:20 GMT
Hello Apache MADlib community,

The vote for releasing Apache MADlib 1.17.0 (RC2) passed with 5 binding
+1s, 2 non-binding +1s, and no 0 or -1 votes.

Below is a summary of the voting:

*Binding (PMC members) +1s (5):*
Frank McQuillan
Nikhil Kak
Xixuan (Aaron) Feng
Nandish Jayaram
Xiaocheng Tang

*Non-binding (non-PMC members) +1s (2):*
Ekta Khanna
Domino Valdano
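
For anyone re-checking the tally, here is a minimal sketch of the passing
rule stated in the call for votes (at least 3 binding +1 votes, and more
binding +1 than binding -1). The function name `vote_passes` and its
parameters are illustrative only and not part of any Apache tooling;
non-binding votes are deliberately left out because the rule does not
count them.

```python
def vote_passes(binding_plus, binding_minus, min_binding_plus=3):
    """Return True if the vote passes under the rule quoted in the call for votes.

    Rule: at least `min_binding_plus` binding +1 votes, and strictly more
    binding +1 votes than binding -1 votes.
    """
    return binding_plus >= min_binding_plus and binding_plus > binding_minus


# Binding +1 voters as listed in the summary above.
binding_plus_voters = [
    "Frank McQuillan",
    "Nikhil Kak",
    "Xixuan (Aaron) Feng",
    "Nandish Jayaram",
    "Xiaocheng Tang",
]

# No binding -1 votes were cast.
result = vote_passes(len(binding_plus_voters), binding_minus=0)
print(result)  # True
```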

Official vote thread:
https://lists.apache.org/thread.html/r01f12d62f67a914da4fe8b36ae0fcfc4756e833ef409d94bd0030a27%40%3Cdev.madlib.apache.org%3E

Thanks to all for taking the time to review and vote! We will now
update necessary links/files to proceed with the release.

Best,

Orhan Kislal

On Wed, Apr 8, 2020 at 9:17 PM Domino Valdano <dvaldano@pivotal.io> wrote:

> Tested on OSX 10.14.6 (Mojave) with gpdb5 assert build
>
> Installs and passes most of dev-check, aside from a couple known failures
> due to asserts.
>
>  PostgreSQL 8.3.23 (Greenplum Database 5.23.0+dev.10.g50015fed64 build
> dev) on x86_64-apple-darwin18.7.0, compiled by GCC Apple LLVM version
> 10.0.1 (clang-1001.0.46.4), 64-bit compiled on Nov  5 2019 08:49:26 (with
> assert checking)
>
> +1  (non-binding)
>
> Domino
>
> On Mon, Apr 6, 2020 at 4:49 PM Orhan Kislal <okislal@pivotal.io> wrote:
>
>> Hello Apache MADlib community,
>>
>> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
>> source release tarball and convenience binaries.
>>
>> We didn't hold a vote for RC1 because we discovered a minor issue before
>> sending the vote.
>>
>> The vote will run for at least 72 hours and will close on Thursday,
>> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
>> and more binding +1 than binding -1 are required to pass.
>>
>> The main goals of this release are:
>>
>> New features
>>     - DL: Add optional params to madlib_keras_fit_multiple_model
>> (MADLIB-1397)
>>     - DL: Fit and evaluate changes for asymmetric cluster config
>> (MADLIB-1393)
>>     - DL: Make param search fit() function work with existing evaluate
>> and predict (MADLIB-1387)
>>     - DL: ParamSearch: Add utility function for generating model
>> selection table (MADLIB-1375)
>>     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
>>     - DL: Preprocessor should evenly distribute data on an arbitrary
>> number of segments (MADLIB-1378)
>>     - DL: Preprocessor support for asymmetric segment distribution
>> (MADLIB-1392)
>>     - DL: Remove model_arch_table column from the output of
>> load_model_selection_table (MADLIB-1381)
>>     - DL: Support DL predict without training on MADlib (MADLIB-1359)
>>     - DL: Transfer learning for multi-model (MADLIB-1389)
>>     - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
>>     - Kmeans: Select number of centroids in k-means (MADLIB-1380)
>>     - PostgreSQL 12 support (MADLIB-1391)
>>
>> Improvements:
>>     - Assoc rules: Add option to set number of posterior in association
>> rules (MADLIB-1327)
>>     - Correlation: Improve correlation and covariance memory usage with
>> large number of groups (MADLIB-1301)
>>     - DL: helper function for asymmetric cluster config (MADLIB-1390)
>>     - DL: Mini-batch preprocessor for images - performance issue
>> (MADLIB-1342)
>>     - DL: Modify warm start logic for DL to handle case of missing weight
>> (MADLIB-1400)
>>     - DL: Param search for multiple models on MPP architecture
>> (MADLIB-1386)
>>     - DL: performance improvements to fit transition function
>> (MADLIB-1418)
>>     - Docs: Enhance Installation Guides (MADLIB-1399)
>>     - Graph: SSSP should not show vertices in output table that are
>> unreachable (MADLIB-1415)
>>     - Knn: Add zero check and output distance array (MADLIB-1370)
>>     - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
>>     - Summary: Last optional param in summary errors when NULL
>> (MADLIB-1413)
>>     - Summary: Summary function has dups for MFV for approximate results
>> (MADLIB-1412)
>>     - SVM: Change default num_components for SVM to max(100,
>> 2*num_features) (MADLIB-1384)
>>
>> Bug fixes:
>>     - DL: Deep Learning module does not work with tables in non-public
>> schemas (MADLIB-1388)
>>     - DL: Exception during madlib_keras_fit when model_arch_id is passed
>> as NULL (MADLIB-1371)
>>     - DL: fit and fit multiple fail with memory exception in gpdb6
>> (MADLIB-1405)
>>     - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
>>     - DL: Intermediate tables are not dropped (MADLIB-1404)
>>     - DL: MADlib Keras operations create too many threads (MADLIB-1372)
>>     - DL: metrics_elapsed_time for fit multi_model not captured correctly
>> (MADLIB-1403)
>>     - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
>>     - DL: Remove final function for fit multiple (MADLIB-1416)
>>     - DL: Support schema qualified output tables for fit and fit_multiple
>> (MADLIB-1417)
>>     - Graph: APSP fails if both vertex id column and edge src column have
>> the same name (MADLIB-1407)
>>     - Graph: APSP Path Function fails if src or dest column type is
>> bigint (MADLIB-1408)
>>     - Graph: Graph/wcc fails if the user specifies a schema for the
>> output table (MADLIB-1411)
>>     - Kmeans: k-means related functions must use same default distance
>> function (MADLIB-1383)
>>     - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
>>     - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
>>     - Pivot: Pivot documentation should say "out_table" instead of
>> "output_table" (MADLIB-1376)
>>
>> Other:
>>     - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
>>     - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
>> some versions of GPDB 6, the database will keep adding to the disk space
>> (in proportion to model size) and will only release the disk space once the
>> fit multiple query has completed execution. This is not the case for GPDB
>> 6.5.0+ where disk space is released during the fit multiple query.
>>     - DL: CUDA GPU memory cannot be released until the process holding it
>> is terminated. This process holds the GPU memory until one of the
>> following two things happens: the query finishes and the user logs out of
>> the Postgres client/session; or the query finishes and the user waits for
>> the timeout set by `gp_vmem_idle_resource_timeout`. The default value for
>> this timeout in Greenplum is 18 sec, but it can be changed.
>>     - DL: pg_temp is not allowed as an output table schema for
>> madlib_keras_fit_multiple_model().
>>     - Build: Enable current versions of bison
>>     - Build: Add cmake variable for gppkg filename
>>     - Build: Add pull request template
>>
>> 1.17.0 docs available here:
>> http://madlib.apache.org/docs/rc/index.html
>>
>> For additional information, please see:
>> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0
>>
>> Here are the release artifact details:
>>
>> Source release tag to be voted on: rc/1.17.0-rc2, located here:
>>
>> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2
>>
>> Source release tarball can be retrieved from the following locations:
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512
>>
>> Convenience binary packages can be retrieved from the following
>> locations:
>>
>> macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512
>>
>> CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512
>>
>> CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
>> (compiled with gcc 6.2)
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512
>>
>> Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512
>>
>> The PGP KEYS file used to validate the signature of the release artifacts
>> is available here:
>> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>>
>> To help in tallying the vote, PMC members please be sure to indicate
>> “(binding)” with the vote.
>>
>> [ ] +1 approve
>> [ ] +0 no opinion
>> [ ] -1 disapprove (and reason why)
>>
>> Best regards,
>> Orhan Kislal <okislal@apache.org>
>>
>
>
> --
> Domino Valdano <dvaldano@vmware.com>
> Pronouns:  She/Her
> VMware Staff Software Engineer
> Modern Applications Platform
>
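
For reviewers checking the artifacts listed in the call for votes, a
minimal sketch of comparing a downloaded package against its published
SHA512 file. The helper names `sha512_of` and `matches_checksum_file` are
hypothetical, and the sketch assumes the `.sha512` file is in
`sha512sum` style (hex digest first, then the file name); if the file
uses a different layout, the parsing line would need adjusting. It does
not replace verifying the PGP signature against the KEYS file.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(artifact, checksum_file):
    """Compare an artifact's SHA-512 digest with the one in a checksum file.

    Assumption: the first whitespace-separated token in the checksum file
    is the hex digest (the `sha512sum` output layout).
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

After downloading both files into the same directory, a call such as
`matches_checksum_file("apache-madlib-1.17.0-src.tar.gz",
"apache-madlib-1.17.0-src.tar.gz.sha512")` should return True when the
digests agree.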
