madlib-user mailing list archives

From Srivatsan Ramanujam <vatsan...@utexas.edu>
Subject Re: [VOTE] MADlib v1.14-rc1
Date Sat, 28 Apr 2018 01:56:02 GMT
Built from source and tested on Mac (High Sierra 10.13.3, cmake version
3.11.0-rc2, Postgres 9.6.4).

+1 (binding)




On Fri, Apr 27, 2018 at 6:09 PM, Jingyi Mei <jmei@pivotal.io> wrote:

> Hi Rashmi,
>
> Thanks for the comments and feedback!
>
> The release link that returns a page-not-found error should not have been
> there, since we haven't made the actual release yet. We have just removed
> the link from that page; it will be added back after the community has
> voted and we have an official release.
>
> Concerning the documentation links for new features, it is definitely a
> great idea to add them to the release notes and the vote email! Thanks for
> the recommendation; we will see if we can make it better in this release.
>
> Cheers,
> Jingyi Mei
>
> On Fri, Apr 27, 2018 at 3:19 PM, Rashmi Raghu <rraghu@pivotal.io> wrote:
>
>> Installed on Postgres 9.6 on MacOS using dmg.
>> Checked out the new additions to the summary function. Looks good. My
>> vote: +1 (binding).
>>
>> Some comments aside from the vote:
>>
>>    - I followed this link in the email:
>>    https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
>>    and then from there clicked on
>>    https://dist.apache.org/repos/dist/release/madlib/1.14/
>>    which gives a page-not-found error.
>>    - I didn't see a link to documentation associated with this release -
>>    it would be useful to have that also available (let me know if it was in
>>    the email and I missed it or if it is not standard practice). For instance,
>>    I wanted to briefly look at the new balanced datasets module and it would
>>    have been easy to look it up in the web version of the docs. I did find the
>>    docs through the function call e.g. madlib.balance_sample('usage') but that
>>    requires knowing roughly what function name to look for (not hard in this
>>    case but I can imagine other situations where it might not be
>>    straightforward)
>>
>> Great to see all the new features and bug fixes!
>>
>> Thanks,
>> Rashmi
>>
>>
>> On Fri, Apr 27, 2018 at 1:40 PM, Orhan Kislal <okislal@pivotal.io> wrote:
>>
>>> Tested on PG 10.3 (src and dmg). Looks good. +1 (binding)
>>>
>>> Thanks for preparing the release Jingyi,
>>>
>>> Orhan Kislal
>>>
>>> On Fri, Apr 27, 2018 at 11:44 AM, Frank McQuillan <fmcquillan@pivotal.io> wrote:
>>>
>>>> Hi Jingyi,
>>>>
>>>> Thanks for posting the artifacts and sending out the vote.
>>>>
>>>> My findings:
>>>>
>>>> Installation and IC passed on postgres 9.6.7
>>>>
>>>> Also, I tested a couple of the new features (personalized page rank and
>>>> mini-batch preprocessor) and they worked OK for me with a small sample
>>>> data set.
>>>>
>>>> +1 (binding)
>>>>
>>>> On Thu, Apr 26, 2018 at 2:57 PM, Jingyi Mei <jmei@pivotal.io> wrote:
>>>>
>>>> > Hello Apache MADlib dev community,
>>>> >
>>>> > This is the vote for Apache MADlib 1.14 Release (RC1). It provides the
>>>> > source release tarball and convenience binaries. This is the third
>>>> > Apache MADlib release as an Apache Top Level Project (TLP).
>>>> >
>>>> > The vote will run for at least 72 working hours and will close on
>>>> > Tuesday, May 1st, 2018 @ 6pm PDT. A minimum of 3 binding +1 votes and
>>>> > more binding +1 than binding -1 are required to pass.
>>>> >
>>>> > The main goals of this release are:
>>>> >
>>>> > New features:
>>>> >
>>>> >    - New module - Balanced datasets: A sampling module to balance
>>>> >    classification datasets by resampling using various techniques
>>>> >    including undersampling, oversampling, uniform sampling or
>>>> >    user-defined proportion sampling (MADLIB-1168)
>>>> >    - Mini-batch: Added a mini-batch optimizer for MLP and a
>>>> >    preprocessor function necessary to create batches from the data
>>>> >    (MADLIB-1200, MADLIB-1206, MADLIB-1220, MADLIB-1224, MADLIB-1226,
>>>> >    MADLIB-1227)
>>>> >    - k-NN: Added weighted averaging/voting by distance (MADLIB-1181)
>>>> >    - Summary: Added additional stats: number of positive, negative,
>>>> >    zero values and 95% confidence intervals for the mean (MADLIB-1167)
>>>> >    - Encode categorical: Updated to produce lower-case column names
>>>> >    when possible (MADLIB-1202)
>>>> >    - MLP: Added support for already one-hot encoded categorical
>>>> >    dependent variable in a classification task (MADLIB-1222)
>>>> >    - Pagerank: Added option for personalized vertices that allows
>>>> >    higher weightage for a subset of vertices which will have a higher
>>>> >    jump probability as compared to other vertices and a random surfer
>>>> >    is more likely to jump to these personalization vertices
>>>> >    (MADLIB-1084)
>>>> >
>>>> > Bug fixes:
>>>> >
>>>> >    - Fixed issue with invalid calls of construct_array that led to
>>>> >    problems in PostgreSQL 10 (MADLIB-1185)
>>>> >    - Added newline between file concatenation during PGXN install
>>>> >    (MADLIB-1194)
>>>> >    - Fixed upgrade issues in knn (MADLIB-1197)
>>>> >    - Added fix to ensure RF variable importances are always
>>>> >    non-negative
>>>> >    - Fixed inconsistency in LDA output and improved usability
>>>> >    (MADLIB-1160, MADLIB-1201)
>>>> >    - Fixed MLP and RF predict for models trained in earlier versions
>>>> >    to ensure missing optional parameters are given appropriate
>>>> >    default values (MADLIB-1207)
>>>> >    - Fixed a scenario in DT where no features exist, due to
>>>> >    categorical columns with a single level being dropped, which led
>>>> >    to the database crashing
>>>> >    - Fixed step size initialization in MLP based on learning rate
>>>> >    policy (MADLIB-1212)
>>>> >    - Fixed PCA issue that leads to failure when grouping column is a
>>>> >    TEXT type (MADLIB-1215)
>>>> >    - Fixed cat levels output in DT when grouping is enabled
>>>> >    (MADLIB-1218)
>>>> >    - Fixed and simplified initialization of model coefficients in MLP
>>>> >    - Removed source table dependency for predicting regression models
>>>> >    in MLP (MADLIB-1223)
>>>> >    - Print loss of first iteration in MLP (MADLIB-1228)
>>>> >    - Fixed MLP failure on GPDB 4.3 when verbose=True (MADLIB-1209)
>>>> >    - Fixed RF issue that showed up when var_importance=True with no
>>>> >    continuous features (MADLIB-1219)
>>>> >    - Fixed DT/RF issue for null_as_category=True and grouping enabled
>>>> >    (MADLIB-1217)
>>>> >
>>>> > Other:
>>>> >
>>>> >    - Reduced install-check runtime for PCA, DT, RF, elastic net
>>>> >    (MADLIB-1216)
>>>> >    - Added CentOS 7 PostgreSQL 9.6/10 docker files
>>>> >
>>>> > For additional information, please see:
>>>> > https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
>>>> >
>>>> > Here are the release artifact details:
>>>> >
>>>> > Source release tag to be voted on: rc/1.14-rc1, located here:
>>>> > https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.14-rc1
>>>> >
>>>> > Source release tarball can be retrieved from the following locations:
>>>> >
>>>> > Package:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz
>>>> > PGP Signature:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.asc
>>>> > SHA512 Hash:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.sha512
>>>> >
>>>> > Convenience binary packages can be retrieved from the following
>>>> > locations:
>>>> >
>>>> > macOS: 10.* PostgreSQL 9.6 & 10.2
>>>> >
>>>> > Package:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg
>>>> > PGP Signature:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.asc
>>>> > SHA512 Hash:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.sha512
>>>> >
>>>> > CentOS* GPDB 4.3.5+
>>>> >
>>>> > Package:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm
>>>> > PGP Signature:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.asc
>>>> > SHA512 Hash:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.sha512
>>>> >
>>>> > CentOS 6 &* GPDB 5.3.0, PostgreSQL 9.6 & 10.2
>>>> >
>>>> > Package:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm
>>>> > PGP Signature:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.asc
>>>> > SHA512 Hash:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.sha512
>>>> >
>>>> > The PGP KEYS file used to validate the signature of the release
>>>> > artifacts is available here:
>>>> > https://dist.apache.org/repos/dist/dev/madlib/KEYS
>>>> >
>>>> > To help in tallying the vote, PMC members please be sure to indicate
>>>> > “(binding)” with the vote.
>>>> >
>>>> > [ ] +1 approve
>>>> > [ ] +0 no opinion
>>>> > [ ] -1 disapprove (and reason why)
>>>> >
>>>> > Regards,
>>>> > Jingyi Mei
>>>> >
>>>> > Pivotal R&D Advanced Analytics
>>>> >
>>>>
>>>
>>>
>>
>>
>> --
>> Rashmi Raghu, Ph.D.
>> Pivotal Data Science
>>
>
>
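The passing rule quoted in the vote email (a minimum of 3 binding +1 votes, and more binding +1 than binding -1) can be sketched as a small check. This is only an illustration: the function name `vote_passes` and its parameters are hypothetical, and the count of four binding +1 votes below reflects only the replies visible in this thread, not a final tally.

```python
def vote_passes(binding_plus, binding_minus, min_plus=3):
    """Return True if the vote meets the rule stated in the vote email.

    binding_plus  -- number of binding +1 votes
    binding_minus -- number of binding -1 votes
    min_plus      -- minimum number of binding +1 votes (3 in the email)
    """
    # Both conditions must hold: enough binding +1 votes, and strictly
    # more binding +1 votes than binding -1 votes.
    return binding_plus >= min_plus and binding_plus > binding_minus


# Binding +1 votes visible in this thread: Srivatsan, Rashmi, Orhan, Frank.
# No binding -1 votes appear in the quoted replies.
print(vote_passes(binding_plus=4, binding_minus=0))
```

Non-binding votes and +0 votes do not enter the rule as stated, so the sketch ignores them.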

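The vote email publishes a SHA512 hash next to each artifact. A minimal sketch of checking a downloaded file against such a hash follows, using only Python's standard `hashlib`. The helper names (`sha512_hex`, `read_chunks`, `digest_matches`) are hypothetical, and the exact layout of the published `.sha512` files is not shown in this thread, so the sketch assumes you pass in just the hex digest copied from that file. It does not cover the PGP signature check against the KEYS file.

```python
import hashlib


def read_chunks(path, chunk_size=1 << 20):
    """Yield the file at `path` in binary chunks to avoid loading it whole."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


def sha512_hex(chunks):
    """Return the lowercase hex SHA512 digest of an iterable of byte chunks."""
    digest = hashlib.sha512()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def digest_matches(computed_hex, published_hex):
    """Compare digests, ignoring case and any whitespace in the published one.

    Assumption: `published_hex` holds only the hex digest (possibly wrapped
    or space-separated), not a filename or other text.
    """
    normalized = "".join(published_hex.split()).lower()
    return computed_hex.lower() == normalized
```

Usage would look like `digest_matches(sha512_hex(read_chunks("apache-madlib-1.14-src.tar.gz")), published)`, where `published` is the digest string taken from the corresponding `.sha512` file.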