madlib-user mailing list archives

From Rashmi Raghu <rra...@pivotal.io>
Subject Re: [VOTE] MADlib v1.14-rc1
Date Fri, 27 Apr 2018 22:19:56 GMT
Installed on Postgres 9.6 on MacOS using dmg.
Checked out the new additions to the summary function. Looks good. My vote:
+1 (binding).

Some comments aside from the vote:

   - I followed this link in the email:
   https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14 and then
   from there clicked on
   https://dist.apache.org/repos/dist/release/madlib/1.14/ which gives a
   page-not-found error.
   - I didn't see a link to documentation associated with this release - it
   would be useful to have that also available (let me know if it was in the
   email and I missed it or if it is not standard practice). For instance, I
   wanted to briefly look at the new balanced datasets module and it would
   have been easy to look it up in the web version of the docs. I did find the
   docs through the function call e.g. madlib.balance_sample('usage') but that
   requires knowing roughly what function name to look for (not hard in this
   case but I can imagine other situations where it might not be
   straightforward)

Great to see all the new features and bug fixes!

Thanks,
Rashmi


On Fri, Apr 27, 2018 at 1:40 PM, Orhan Kislal <okislal@pivotal.io> wrote:

> Tested on PG 10.3 (src and dmg). Looks good. +1 (binding)
>
> Thanks for preparing the release Jingyi,
>
> Orhan Kislal
>
> On Fri, Apr 27, 2018 at 11:44 AM, Frank McQuillan <fmcquillan@pivotal.io>
> wrote:
>
>> Hi Jingyi,
>>
>> Thanks for posting the artifacts and sending out the vote.
>>
>> My findings:
>>
>> Installation and IC passed on postgres 9.6.7
>>
>> Also I tested a couple of the new features (personalized page rank and
>> mini-batch preprocessor)
>> and they worked OK for me with a small sample data set.
>>
>> +1 (binding)
>>
>> On Thu, Apr 26, 2018 at 2:57 PM, Jingyi Mei <jmei@pivotal.io> wrote:
>>
>> > Hello Apache MADlib dev community,
>> >
>> > This is the vote for Apache MADlib 1.14 Release (RC1). It provides the
>> > source release tarball and convenience binaries. This is the third
>> > Apache MADlib release as an Apache Top Level Project (TLP).
>> >
>> > The vote will run for at least 72 working hours and will close on
>> > Tuesday, May 1st, 2018 @ 6pm PDT. A minimum of 3 binding +1 votes and
>> > more binding +1 than binding -1 are required to pass.
>> >
>> > The main goals of this release are:
>> >
>> > New features:
>> >
>> >    - New module - Balanced datasets: A sampling module to balance
>> >    classification datasets by resampling using various techniques
>> >    including undersampling, oversampling, uniform sampling or
>> >    user-defined proportion sampling (MADLIB-1168)
>> >    - Mini-batch: Added a mini-batch optimizer for MLP and a preprocessor
>> >    function necessary to create batches from the data (MADLIB-1200,
>> >    MADLIB-1206, MADLIB-1220, MADLIB-1224, MADLIB-1226, MADLIB-1227)
>> >    - k-NN: Added weighted averaging/voting by distance (MADLIB-1181)
>> >    - Summary: Added additional stats: number of positive, negative, zero
>> >    values and 95% confidence intervals for the mean (MADLIB-1167)
>> >    - Encode categorical: Updated to produce lower-case column names when
>> >    possible (MADLIB-1202)
>> >    - MLP: Added support for already one-hot encoded categorical
>> >    dependent variable in a classification task (MADLIB-1222)
>> >    - Pagerank: Added option for personalized vertices that gives a
>> >    subset of vertices a higher weight. These vertices have a higher jump
>> >    probability than other vertices, so a random surfer is more likely to
>> >    jump to them (MADLIB-1084)
>> >
>> > Bug fixes:
>> >
>> >    - Fixed issue with invalid calls of construct_array that led to
>> >    problems in PostgreSQL 10 (MADLIB-1185)
>> >    - Added newline between file concatenation during PGXN install
>> >    (MADLIB-1194)
>> >    - Fixed upgrade issues in knn (MADLIB-1197)
>> >    - Added fix to ensure RF variable importance values are always
>> >    non-negative
>> >    - Fixed inconsistency in LDA output and improved usability
>> >    (MADLIB-1160, MADLIB-1201)
>> >    - Fixed MLP and RF predict for models trained in earlier versions to
>> >    ensure missing optional parameters are given appropriate default
>> >    values (MADLIB-1207)
>> >    - Fixed a scenario in DT where no features exist because categorical
>> >    columns with a single level are dropped, which led to the database
>> >    crashing
>> >    - Fixed step size initialization in MLP based on learning rate policy
>> >    (MADLIB-1212)
>> >    - Fixed PCA issue that leads to failure when grouping column is a
>> >    TEXT type (MADLIB-1215)
>> >    - Fixed cat levels output in DT when grouping is enabled
>> >    (MADLIB-1218)
>> >    - Fixed and simplified initialization of model coefficients in MLP
>> >    - Removed source table dependency for predicting regression models in
>> >    MLP (MADLIB-1223)
>> >    - Print loss of first iteration in MLP (MADLIB-1228)
>> >    - Fixed MLP failure on GPDB 4.3 when verbose=True (MADLIB-1209)
>> >    - Fixed RF issue that showed up when var_importance=True with no
>> >    continuous features (MADLIB-1219)
>> >    - Fixed DT/RF issue for null_as_category=True and grouping enabled
>> >    (MADLIB-1217)
>> >
>> > Other:
>> >
>> >    - Reduced install-check runtime for PCA, DT, RF, elastic net
>> >    (MADLIB-1216)
>> >    - Added CentOS 7 PostgreSQL 9.6/10 docker files
>> >
>> > For additional information, please see:
>> > https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
>> >
>> > Here are the release artifact details:
>> >
>> > Source release tag to be voted on: rc/1.14-rc1, located here:
>> > https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.14-rc1
>> >
>> > Source release tarball can be retrieved from the following locations:
>> >
>> > Package:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz
>> > PGP Signature:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.asc
>> > SHA512 Hash:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.sha512
>> >
>> > Convenience binary packages can be retrieved from the following
>> > locations:
>> >
>> > macOS: 10.* PostgreSQL 9.6 & 10.2
>> >
>> > Package:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg
>> > PGP Signature:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.asc
>> > SHA512 Hash:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.sha512
>> >
>> > CentOS* GPDB 4.3.5+
>> >
>> > Package:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm
>> > PGP Signature:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.asc
>> > SHA512 Hash:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.sha512
>> >
>> > CentOS 6 &* GPDB 5.3.0, PostgreSQL 9.6 & 10.2
>> >
>> > Package:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm
>> > PGP Signature:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.asc
>> > SHA512 Hash:
>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.sha512
>> >
>> > The PGP KEYS file used to validate the signature of the release
>> > artifacts is available here:
>> > https://dist.apache.org/repos/dist/dev/madlib/KEYS
>> >
>> > To help in tallying the vote, PMC members please be sure to indicate
>> > “(binding)” with the vote.
>> >
>> > [ ] +1 approve
>> > [ ] +0 no opinion
>> > [ ] -1 disapprove (and reason why)
>> >
>> > Regards,
>> > Jingyi Mei
>> >
>> > Pivotal R&D Advanced Analytics
>> >
>> >
>>
>
>


-- 
Rashmi Raghu, Ph.D.
Pivotal Data Science
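
For anyone checking the release artifacts listed in the vote email, here is a
minimal sketch of comparing a downloaded file against its published SHA512
hash. It is only an illustration, not part of the release process described
above. The function names and the local file names in the usage comment are
hypothetical, and the code assumes the .sha512 file contains the hex digest
somewhere in its text, possibly split into whitespace-separated groups or
accompanied by the file name.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large artifacts need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(artifact_path, hash_path):
    """Check whether the artifact's digest appears in the published hash file.

    Assumption: the hash file holds the hex digest, in upper or lower case,
    possibly broken up by spaces and newlines. All whitespace is removed and
    the text is lower-cased before searching for the computed digest.
    """
    published = Path(hash_path).read_text()
    normalized = "".join(published.split()).lower()
    return sha512_of(artifact_path) in normalized


# Hypothetical usage, assuming both files were downloaded to the current
# directory:
#   matches_published_hash("apache-madlib-1.14-src.tar.gz",
#                          "apache-madlib-1.14-src.tar.gz.sha512")
```

A matching hash only shows the download is intact. Checking the PGP signature
against the KEYS file is a separate step (for example with gpg --verify).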
