madlib-user mailing list archives

From Jingyi Mei <j...@pivotal.io>
Subject Re: [VOTE] MADlib v1.14-rc1
Date Sat, 28 Apr 2018 01:09:12 GMT
Hi Rashmi,

Thanks for the comments and feedback!

The release link that returns a page-not-found error should not have been
there, since we haven't made the actual release yet. We have just removed
the link from that page, and it will be added back once the community has
voted and we have an official release.

Concerning the documentation links for new features, it is definitely a
great idea to add them to the release notes and to the vote email! Thanks
for the recommendation; we will see if we can improve this in this release.

Cheers,
Jingyi Mei

On Fri, Apr 27, 2018 at 3:19 PM, Rashmi Raghu <rraghu@pivotal.io> wrote:

> Installed on Postgres 9.6 on MacOS using dmg.
> Checked out the new additions to the summary function. Looks good. My
> vote: +1 (binding).
>
> Some comments aside from the vote:
>
>    - I followed this link in the email:
>    https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
>    and then from there clicked on
>    https://dist.apache.org/repos/dist/release/madlib/1.14/
>    which gives a page-not-found error.
>    - I didn't see a link to documentation associated with this release -
>    it would be useful to have that also available (let me know if it was in
>    the email and I missed it or if it is not standard practice). For instance,
>    I wanted to briefly look at the new balanced datasets module and it would
>    have been easy to look it up in the web version of the docs. I did find the
>    docs through the function call e.g. madlib.balance_sample('usage') but that
>    requires knowing roughly what function name to look for (not hard in this
>    case but I can imagine other situations where it might not be
>    straightforward)
>
> Great to see all the new features and bug fixes!
>
> Thanks,
> Rashmi
>
>
> On Fri, Apr 27, 2018 at 1:40 PM, Orhan Kislal <okislal@pivotal.io> wrote:
>
>> Tested on PG 10.3 (src and dmg). Looks good. +1 (binding)
>>
>> Thanks for preparing the release Jingyi,
>>
>> Orhan Kislal
>>
>> On Fri, Apr 27, 2018 at 11:44 AM, Frank McQuillan <fmcquillan@pivotal.io>
>> wrote:
>>
>>> Hi Jingyi,
>>>
>>> Thanks for posting the artifacts and sending out the vote.
>>>
>>> My findings:
>>>
>>> Installation and IC passed on postgres 9.6.7
>>>
>>> Also I tested a couple of the new features (personalized page rank and
>>> mini-batch preprocessor) and they worked OK for me with a small sample
>>> data set.
>>>
>>> +1 (binding)
>>>
>>> On Thu, Apr 26, 2018 at 2:57 PM, Jingyi Mei <jmei@pivotal.io> wrote:
>>>
>>> > Hello Apache MADlib dev community,
>>> >
>>> > This is the vote for Apache MADlib 1.14 Release (RC1). It provides the
>>> > source release tarball and convenience binaries. This is the third
>>> > Apache MADlib release as an Apache Top Level Project (TLP).
>>> >
>>> > The vote will run for at least 72 working hours and will close on
>>> > Tuesday, May 1st, 2018 @ 6pm PDT. A minimum of 3 binding +1 votes and
>>> > more binding +1 than binding -1 are required to pass.
>>> >
>>> > The main goals of this release are:
>>> >
>>> > New features:
>>> >
>>> >    - New module - Balanced datasets: A sampling module to balance
>>> >    classification datasets by resampling using various techniques
>>> >    including undersampling, oversampling, uniform sampling or
>>> >    user-defined proportion sampling (MADLIB-1168)
>>> >    - Mini-batch: Added a mini-batch optimizer for MLP and a
>>> >    preprocessor function necessary to create batches from the data
>>> >    (MADLIB-1200, MADLIB-1206, MADLIB-1220, MADLIB-1224, MADLIB-1226,
>>> >    MADLIB-1227)
>>> >    - k-NN: Added weighted averaging/voting by distance (MADLIB-1181)
>>> >    - Summary: Added additional stats: number of positive, negative,
>>> >    zero values and 95% confidence intervals for the mean (MADLIB-1167)
>>> >    - Encode categorical: Updated to produce lower-case column names
>>> >    when possible (MADLIB-1202)
>>> >    - MLP: Added support for already one-hot encoded categorical
>>> >    dependent variable in a classification task (MADLIB-1222)
>>> >    - Pagerank: Added option for personalized vertices that allows
>>> >    higher weightage for a subset of vertices which will have a higher
>>> >    jump probability as compared to other vertices and a random surfer
>>> >    is more likely to jump to these personalization vertices
>>> >    (MADLIB-1084)
>>> >
>>> > Bug fixes:
>>> >
>>> >    - Fixed issue with invalid calls of construct_array that led to
>>> >    problems in PostgreSQL 10 (MADLIB-1185)
>>> >    - Added newline between file concatenation during PGXN install
>>> >    (MADLIB-1194)
>>> >    - Fixed upgrade issues in knn (MADLIB-1197)
>>> >    - Added fix to ensure RF variable importance values are always
>>> >    non-negative
>>> >    - Fixed inconsistency in LDA output and improved usability
>>> >    (MADLIB-1160, MADLIB-1201)
>>> >    - Fixed MLP and RF predict for models trained in earlier versions to
>>> >    ensure missing optional parameters are given appropriate default
>>> >    values (MADLIB-1207)
>>> >    - Fixed a scenario in DT where no features exist because categorical
>>> >    columns with a single level were dropped, which led to the database
>>> >    crashing
>>> >    - Fixed step size initialization in MLP based on learning rate
>>> >    policy (MADLIB-1212)
>>> >    - Fixed PCA issue that leads to failure when grouping column is a
>>> >    TEXT type (MADLIB-1215)
>>> >    - Fixed cat levels output in DT when grouping is enabled
>>> >    (MADLIB-1218)
>>> >    - Fixed and simplified initialization of model coefficients in MLP
>>> >    - Removed source table dependency for predicting regression models
>>> >    in MLP (MADLIB-1223)
>>> >    - Print loss of first iteration in MLP (MADLIB-1228)
>>> >    - Fixed MLP failure on GPDB 4.3 when verbose=True (MADLIB-1209)
>>> >    - Fixed RF issue that showed up when var_importance=True with no
>>> >    continuous features (MADLIB-1219)
>>> >    - Fixed DT/RF issue for null_as_category=True and grouping enabled
>>> >    (MADLIB-1217)
>>> >
>>> > Other:
>>> >
>>> >    - Reduced install-check runtime for PCA, DT, RF, elastic net
>>> >    (MADLIB-1216)
>>> >    - Added CentOS 7 PostgreSQL 9.6/10 docker files
>>> >
>>> > For additional information, please see:
>>> > https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
>>> >
>>> > Here are the release artifact details:
>>> >
>>> > Source release tag to be voted on: rc/1.14-rc1, located here:
>>> > https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.14-rc1
>>> >
>>> > Source release tarball can be retrieved from the following locations:
>>> >
>>> > Package:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz
>>> > PGP Signature:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.asc
>>> > SHA512 Hash:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.sha512
>>> >
>>> > Convenience binary packages can be retrieved from the following
>>> > locations:
>>> >
>>> > macOS: 10.* PostgreSQL 9.6 & 10.2
>>> >
>>> > Package:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg
>>> > PGP Signature:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.asc
>>> > SHA512 Hash:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.sha512
>>> >
>>> > CentOS* GPDB 4.3.5+
>>> >
>>> > Package:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm
>>> > PGP Signature:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.asc
>>> > SHA512 Hash:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.sha512
>>> >
>>> > CentOS 6 &* GPDB 5.3.0, PostgreSQL 9.6 & 10.2
>>> >
>>> > Package:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm
>>> > PGP Signature:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.asc
>>> > SHA512 Hash:
>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.sha512
>>> >
>>> > The PGP KEYS file used to validate the signature of the release
>>> > artifacts is available here:
>>> > https://dist.apache.org/repos/dist/dev/madlib/KEYS
>>> >
>>> > To help in tallying the vote, PMC members please be sure to indicate
>>> > “(binding)” with the vote.
>>> >
>>> > [ ] +1 approve
>>> > [ ] +0 no opinion
>>> > [ ] -1 disapprove (and reason why)
>>> >
>>> > Regards,
>>> > Jingyi Mei
>>> >
>>> > Pivotal R&D Advanced Analytics
>>> >
>>> >
>>>
>>
>>
>
>
> --
> Rashmi Raghu, Ph.D.
> Pivotal Data Science
>
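
For anyone checking the release candidate, the vote email lists a SHA512 hash file next to each artifact. A minimal Python sketch of that check follows. It assumes you have already downloaded the artifact and copied the hex digits out of the matching .sha512 file by hand; the layout of that file is not described in this thread, so the sketch does not try to parse it. The function names sha512_hex and matches_published_hash are hypothetical, and this does not cover the PGP signature check, which needs the KEYS file and a tool such as gpg.

```python
import hashlib


def sha512_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-512 digest of a file as lowercase hex, reading in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a file's SHA-512 against hex digits copied from a hash file.

    Assumption: `published_hex` contains only the hex digits; embedded
    whitespace and letter case are ignored before comparing.
    """
    expected = "".join(published_hex.split()).lower()
    return sha512_hex(path) == expected


# Example call (paths and digest are placeholders, not real values):
# matches_published_hash("apache-madlib-1.14-src.tar.gz", "<hex digits>")
```

Reading in chunks keeps memory use flat for large packages such as the dmg and rpm files.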

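
The passing rule stated in the vote email (a minimum of 3 binding +1 votes, and more binding +1 than binding -1) can be sketched in Python as below. The ballot representation and the name vote_passes are assumptions made for illustration; they are not part of any Apache tooling, and the sketch ignores the voting deadline.

```python
from typing import Iterable, Tuple

# Hypothetical ballot representation: (vote, is_binding), vote in {+1, 0, -1}.
Ballot = Tuple[int, bool]


def vote_passes(ballots: Iterable[Ballot], min_binding_plus: int = 3) -> bool:
    """Apply the rule from the vote email to a collection of ballots.

    Passes when there are at least `min_binding_plus` binding +1 votes
    and strictly more binding +1 votes than binding -1 votes.
    Non-binding ballots are ignored for the tally.
    """
    binding = [vote for vote, is_binding in ballots if is_binding]
    plus = sum(1 for v in binding if v == 1)
    minus = sum(1 for v in binding if v == -1)
    return plus >= min_binding_plus and plus > minus


# The three binding +1 votes visible in this thread (Rashmi, Orhan, Frank).
thread_ballots = [(1, True), (1, True), (1, True)]
print(vote_passes(thread_ballots))  # prints True
```

Votes cast elsewhere on the list are not visible here, so this reflects only the ballots quoted in this message.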