Posted to user@madlib.apache.org by Orhan Kislal <ok...@pivotal.io> on 2020/04/06 23:48:58 UTC

[VOTE] MADlib v1.17.0-rc2

Hello Apache MADlib community,

This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
source release tarball and convenience binaries.

We didn't hold a vote for RC1 because we discovered a minor issue before
sending the vote.

The vote will run for at least 72 hours and will close on Thursday,
April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
and more binding +1 than binding -1 are required to pass.
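
The passing rule above can be sketched as a small check. This is only an
illustration: the function name `vote_passes` and the (value, is_binding)
ballot representation are assumptions made for this example, not part of
any Apache tooling.

```python
from collections import Counter


def vote_passes(ballots, min_binding_plus_one=3):
    """Apply the rule stated above to a list of ballots.

    `ballots` is an iterable of (value, is_binding) pairs, where value is
    +1, 0, or -1. Only binding ballots count toward the result.
    """
    binding = Counter(value for value, is_binding in ballots if is_binding)
    enough_approvals = binding[1] >= min_binding_plus_one
    more_for_than_against = binding[1] > binding[-1]
    return enough_approvals and more_for_than_against


# Hypothetical ballots, for illustration only.
example = [(1, True), (1, True), (1, True), (1, False), (-1, True)]
print(vote_passes(example))
```

Non-binding votes are still welcome as feedback; in this sketch they simply
do not affect the result.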

The main goals of this release are:

New features:
    - DL: Add optional params to madlib_keras_fit_multiple_model
(MADLIB-1397)
    - DL: Fit and evaluate changes for asymmetric cluster config
(MADLIB-1393)
    - DL: Make param search fit() function work with existing evaluate and
predict (MADLIB-1387)
    - DL: ParamSearch: Add utility function for generating model selection
table (MADLIB-1375)
    - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
    - DL: Preprocessor should evenly distribute data on an arbitrary number
of segments (MADLIB-1378)
    - DL: Preprocessor support for asymmetric segment distribution
(MADLIB-1392)
    - DL: Remove model_arch_table column from the output of
load_model_selection_table (MADLIB-1381)
    - DL: Support DL predict without training on MADlib (MADLIB-1359)
    - DL: Transfer learning for multi-model (MADLIB-1389)
    - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
    - Kmeans: Select number of centroids in k-means (MADLIB-1380)
    - PostgreSQL 12 support (MADLIB-1391)

Improvements:
    - Assoc rules: Add option to set number of posterior in association
rules (MADLIB-1327)
    - Correlation: Improve correlation and covariance memory usage with
large number of groups (MADLIB-1301)
    - DL: helper function for asymmetric cluster config (MADLIB-1390)
    - DL: Mini-batch preprocessor for images - performance issue
(MADLIB-1342)
    - DL: Modify warm start logic for DL to handle case of missing weight
(MADLIB-1400)
    - DL: Param search for multiple models on MPP architecture (MADLIB-1386)
    - DL: performance improvements to fit transition function (MADLIB-1418)
    - Docs: Enhance Installation Guides (MADLIB-1399)
    - Graph: SSSP should not show vertices in output table that are
unreachable (MADLIB-1415)
    - Knn: Add zero check and output distance array (MADLIB-1370)
    - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
    - Summary: Last optional param in summary errors when NULL (MADLIB-1413)
    - Summary: Summary function has dups for MFV for approximate results
(MADLIB-1412)
    - SVM: Change default num_components for SVM to max(100,
2*num_features) (MADLIB-1384)

Bug fixes:
    - DL: Deep Learning module does not work with tables in non-public
schemas (MADLIB-1388)
    - DL: Exception during madlib_keras_fit when model_arch_id is passed as
NULL (MADLIB-1371)
    - DL: fit and fit multiple fail with memory exception in gpdb6
(MADLIB-1405)
    - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
    - DL: Intermediate tables are not dropped (MADLIB-1404)
    - DL: MADlib Keras operations create too many threads (MADLIB-1372)
    - DL: metrics_elapsed_time for fit multi_model not captured correctly
(MADLIB-1403)
    - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
    - DL: Remove final function for fit multiple (MADLIB-1416)
    - DL: Support schema qualified output tables for fit and fit_multiple
(MADLIB-1417)
    - Graph: APSP fails if both vertex id column and edge src column have
the same name (MADLIB-1407)
    - Graph: APSP Path Function fails if src or dest column type is bigint
(MADLIB-1408)
    - Graph: Graph/wcc fails if the user specifies a schema for the output
table (MADLIB-1411)
    - Kmeans: k-means related functions must use same default distance
function (MADLIB-1383)
    - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
    - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
    - Pivot: Pivot documentation should say "out_table" instead of
"output_table" (MADLIB-1376)

Other:
    - DL: Support up to Keras version 2.2.4, TensorFlow version 1.14
    - DL: When 'madlib_keras_fit_multiple_model()' runs on GPDB 5 and
some versions of GPDB 6, the database keeps consuming disk space (in
proportion to model size) and releases it only once the fit multiple
query has completed. This is not the case for GPDB 6.5.0+, where disk
space is released during the fit multiple query.
    - DL: CUDA GPU memory cannot be released until the process holding it
is terminated. The process holds the GPU memory until one of the
following two things happens: the query finishes and the user logs out of
the Postgres client/session; or the query finishes and the user waits for
the timeout set by `gp_vmem_idle_resource_timeout`. The default value for
this timeout in Greenplum is 18 sec, but it can be changed.
    - DL: pg_temp is not allowed as an output table schema for
madlib_keras_fit_multiple_model().
    - Build: Enable current versions of bison
    - Build: Add cmake variable for gppkg filename
    - Build: Add pull request template

1.17.0 docs available here:
http://madlib.apache.org/docs/rc/index.html

For additional information, please see:
https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0

Here are the release artifact details:

Source release tag to be voted on: rc/1.17.0-rc2, located here:
https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2

Source release tarball can be retrieved from the following locations:
Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512

Convenience binary packages can be retrieved from the following
locations:

macOS 10.14, GPDB 5.* & 6.*, PostgreSQL 11 & 12

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512

CentOS 5.*, GPDB 4.3.5+ (compiled with gcc 4.4)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512

CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
(compiled with gcc 6.2)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512

Ubuntu 18.04, GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512

The PGP KEYS file used to validate the signature of the release artifacts
is available here:
https://dist.apache.org/repos/dist/dev/madlib/KEYS
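
As a rough sketch of the hash part of that validation, assuming an artifact
has already been downloaded and the published SHA512 value has been copied
out of the matching `.sha512` file by hand (the exact layout of that file
is not described in this message): the helper names `sha512_hex` and
`digest_matches` are hypothetical. This only checks the hash; it does not
replace verifying the PGP signature against the KEYS file.

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Return the SHA-512 digest of the file at `path` as lowercase hex."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in chunks so large packages are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, published_hex):
    """Compare a file's digest with a published value.

    Whitespace and letter case in `published_hex` are ignored, since hash
    files are sometimes wrapped or written in upper case.
    """
    normalized = "".join(published_hex.split()).lower()
    return sha512_hex(path) == normalized
```

For example, `digest_matches("apache-madlib-1.17.0-src.tar.gz", value)`
would return True when `value` is the hex string published for the source
tarball and the download is intact.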

To help in tallying the vote, PMC members please be sure to indicate
“(binding)” with the vote.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Best regards,
Orhan Kislal <ok...@apache.org>

Re: [VOTE] MADlib v1.17.0-rc2

Posted by "FENG, Xixuan (Aaron)" <xi...@gmail.com>.
+1 binding

Install check passed with Postgres 11 on Ubuntu.

Thanks for the hard work!

On Thu, Apr 9, 2020 at 8:58, Nikhil Kak <nk...@pivotal.io> wrote:

> +1 binding
>
> Successfully installed madlib and ran install-check and dev check with
> 1. dmg on macOS High Sierra 10.13.6 with gpdb 6.6
> 2. src tar gz compiled with gcc 6.5.0 on macOS High Sierra 10.13.6 with
> gpdb 6.6
>
> - Nikhil
>
> On Wed, Apr 8, 2020 at 4:53 PM Ekta Khanna <ek...@pivotal.io> wrote:
>
>> +1
>>
>> Tested on OSX (10.13.6) with GPDB 5.24.0, GPDB 6.5.0 and PG11.6
>> - passed install-check, dev-check and unit-test
>>
>> Tested on Ubuntu 18.04 with GPDB 6.5.0
>> - passed install-check, dev-check and unit-test
>>
>> LGTM!
>>
>> Thanks,
>> Ekta
>
>
> --
> Thanks,
> Nikhil Kak
>

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Nikhil Kak <nk...@pivotal.io>.
+1 binding

Successfully installed madlib and ran install-check and dev check with
1. dmg on macOS High Sierra 10.13.6 with gpdb 6.6
2. src tar gz compiled with gcc 6.5.0 on macOS High Sierra 10.13.6 with
gpdb 6.6

- Nikhil

On Wed, Apr 8, 2020 at 4:53 PM Ekta Khanna <ek...@pivotal.io> wrote:

> +1
>
> Tested on OSX (10.13.6) with GPDB 5.24.0, GPDB 6.5.0 and PG11.6
> - passed install-check, dev-check and unit-test
>
> Tested on Ubuntu 18.04 with GPDB 6.5.0
> - passed install-check, dev-check and unit-test
>
> LGTM!
>
> Thanks,
> Ekta
>
> On Mon, Apr 6, 2020 at 4:49 PM Orhan Kislal <ok...@pivotal.io> wrote:
>
> > Hello Apache MADlib community,
> >
> > This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
> > source release tarball and convenience binaries.
> >
> > We didn't hold a vote for RC1 because we discovered a minor issue before
> > sending the vote.
> >
> > The vote will run for at least 72 hours and will close on Thursday,
> > April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
> > and more binding +1 than binding -1 are required to pass.
> >
> > The main goals of this release are:
> >
> > New features
> >     - DL: Add optional params to madlib_keras_fit_multiple_model
> > (MADLIB-1397)
> >     - DL: Fit and evaluate changes for asymmetric cluster config
> > (MADLIB-1393)
> >     - DL: Make param search fit() function work with existing evaluate
> and
> > predict (MADLIB-1387)
> >     - DL: ParamSearch: Add utility function for generating model
> selection
> > table (MADLIB-1375)
> >     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
> >     - DL: Preprocessor should evenly distribute data on an arbitrary
> > number of segments (MADLIB-1378)
> >     - DL: Preprocessor support for asymmetric segment distribution
> > (MADLIB-1392)
> >     - DL: Remove model_arch_table column from the output of
> > load_model_selection_table (MADLIB-1381)
> >     - DL: Support DL predict without training on MADlib (MADLIB-1359)
> >     - DL: Transfer learning for multi-model (MADLIB-1389)


-- 
Thanks,
Nikhil Kak

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Nikhil Kak <nk...@pivotal.io>.
+1 binding

Successfully installed MADlib and ran install-check and dev-check with:
1. dmg on macOS High Sierra 10.13.6 with GPDB 6.6
2. src tar.gz compiled with gcc 6.5.0 on macOS High Sierra 10.13.6 with
GPDB 6.6
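
For anyone repeating the source tarball check, the SHA512 hash published
next to the release artifact can be compared with a locally computed
digest. The following is only a minimal sketch, not part of the release
tooling. It assumes the tarball and its .sha512 file were downloaded into
the current directory under the names used in the announcement, and that
the .sha512 file contains the hex digest somewhere in its text (the exact
file layout is an assumption). The helper names sha512_of_file and
digest_matches are hypothetical.

```python
import hashlib
import os


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, published_text):
    """Check whether the file's digest appears in the published text.

    Whitespace is stripped and case is ignored, so both a plain
    "digest  filename" line and a digest split into spaced, upper-case
    groups are accepted (the published format is an assumption).
    """
    normalized = "".join(published_text.split()).lower()
    return sha512_of_file(path) in normalized


if __name__ == "__main__":
    # File names taken from the announcement; local paths are assumed.
    tarball = "apache-madlib-1.17.0-src.tar.gz"
    checksum_file = tarball + ".sha512"
    if os.path.exists(tarball) and os.path.exists(checksum_file):
        with open(checksum_file) as handle:
            print("SHA512 matches:", digest_matches(tarball, handle.read()))
```

This covers the hash only. The PGP signature still has to be verified
separately against the KEYS file.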

- Nikhil

On Wed, Apr 8, 2020 at 4:53 PM Ekta Khanna <ek...@pivotal.io> wrote:

> +1
>
> Tested on OSX (10.13.6) with GPDB 5.24.0, GPDB 6.5.0 and PG11.6
> - passed install-check, dev-check and unit-test
>
> Tested on Ubuntu 18.04 with GPDB 6.5.0
> - passed install-check, dev-check and unit-test
>
> LGTM!
>
> Thanks,
> Ekta
>


-- 
Thanks,
Nikhil Kak

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Ekta Khanna <ek...@pivotal.io>.
+1

Tested on OSX (10.13.6) with GPDB 5.24.0, GPDB 6.5.0 and PG11.6
- passed install-check, dev-check and unit-test

Tested on Ubuntu 18.04 with GPDB 6.5.0
- passed install-check, dev-check and unit-test

LGTM!

Thanks,
Ekta


Re: [VOTE] MADlib v1.17.0-rc2

Posted by Orhan Kislal <ok...@pivotal.io>.
Hello Apache MADlib community,

The vote for releasing Apache MADlib 1.17.0 (RC2) passed with 5 binding
+1s, 2 non-binding +1s, and no 0 or -1 votes.

Below is a summary of the voting:

*Binding (PMC members) +1s (5):*
Frank McQuillan
Nikhil Kak
Xixuan (Aaron) Feng
Nandish Jayaram
Xiaocheng Tang

*Non-binding (non-PMC members) +1s (2):*
Ekta Khanna
Domino Valdano
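
The pass rule from the vote announcement (a minimum of 3 binding +1 votes,
and more binding +1 than binding -1) can be checked against the tally
above. This is only an illustrative sketch. The function name vote_passes
and its parameters are hypothetical, and non-binding votes are left out
because the stated rule counts binding votes only.

```python
def vote_passes(binding_plus, binding_minus, minimum_binding_plus=3):
    """Apply the rule stated in the vote announcement.

    A release passes when there are at least `minimum_binding_plus`
    binding +1 votes and strictly more binding +1 than binding -1 votes.
    """
    return binding_plus >= minimum_binding_plus and binding_plus > binding_minus


# Binding +1 voters as listed in the summary above.
binding_plus_voters = [
    "Frank McQuillan",
    "Nikhil Kak",
    "Xixuan (Aaron) Feng",
    "Nandish Jayaram",
    "Xiaocheng Tang",
]

# No binding -1 votes were cast.
print(vote_passes(len(binding_plus_voters), binding_minus=0))
```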

Official vote thread:
https://lists.apache.org/thread.html/r01f12d62f67a914da4fe8b36ae0fcfc4756e833ef409d94bd0030a27%40%3Cdev.madlib.apache.org%3E

Thanks to all for taking the time to review and vote! We will now
update the necessary links and files to proceed with the release.

Best,

Orhan Kislal

On Wed, Apr 8, 2020 at 9:17 PM Domino Valdano <dv...@pivotal.io> wrote:

> Tested on OSX 10.14.6 (Mojave) with gpdb5 assert build
>
> Installs and passes most of dev-check, aside from a couple known failures
> due to asserts.
>
>  PostgreSQL 8.3.23 (Greenplum Database 5.23.0+dev.10.g50015fed64 build
> dev) on x86_64-apple-darwin18.7.0, compiled by GCC Apple LLVM version
> 10.0.1 (clang-1001.0.46.4), 64-bit compiled on Nov  5 2019 08:49:26 (with
> assert checking)
>
> +1  (non-binding)
>
> Domino
>
> On Mon, Apr 6, 2020 at 4:49 PM Orhan Kislal <ok...@pivotal.io> wrote:
>
>> Hello Apache MADlib community,
>>
>> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
>> source release tarball and convenience binaries.
>>
>> We didn't hold a vote for RC1 because we discovered a minor issue before
>> sending the vote.
>>
>> The vote will run for at least 72 hours and will close on Thursday,
>> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
>> and more binding +1 than binding -1 are required to pass.
>>
>> The main goals of this release are:
>>
>> New features
>>     - DL: Add optional params to madlib_keras_fit_multiple_model
>> (MADLIB-1397)
>>     - DL: Fit and evaluate changes for asymmetric cluster config
>> (MADLIB-1393)
>>     - DL: Make param search fit() function work with existing evaluate
>> and predict (MADLIB-1387)
>>     - DL: ParamSearch: Add utility function for generating model
>> selection table (MADLIB-1375)
>>     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
>>     - DL: Preprocessor should evenly distribute data on an arbitrary
>> number of segments (MADLIB-1378)
>>     - DL: Preprocessor support for asymmetric segment distribution
>> (MADLIB-1392)
>>     - DL: Remove model_arch_table column from the output of
>> load_model_selection_table (MADLIB-1381)
>>     - DL: Support DL predict without training on MADlib (MADLIB-1359)
>>     - DL: Transfer learning for multi-model (MADLIB-1389)
>>     - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
>>     - Kmeans: Select number of centroids in k-means (MADLIB-1380)
>>     - PostgreSQL 12 support (MADLIB-1391)
>>
>> Improvements:
>>     - Assoc rules: Add option to set number of posterior in association
>> rules (MADLIB-1327)
>>     - Correlation: Improve correlation and covariance memory usage with
>> large number of groups (MADLIB-1301)
>>     - DL: helper function for asymmetric cluster config (MADLIB-1390)
>>     - DL: Mini-batch preprocessor for images - performance issue
>> (MADLIB-1342)
>>     - DL: Modify warm start logic for DL to handle case of missing weight
>> (MADLIB-1400)
>>     - DL: Param search for multiple models on MPP architecture
>> (MADLIB-1386)
>>     - DL: performance improvements to fit transition function
>> (MADLIB-1418)
>>     - Docs: Enhance Installation Guides (MADLIB-1399)
>>     - Graph: SSSP should not show vertices in output table that are
>> unreachable (MADLIB-1415)
>>     - Knn - add zero check and output distance array (MADLIB-1370)
>>     - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
>>     - Summary: Last optional param in summary errors when NULL
>> (MADLIB-1413)
>>     - Summary: Summary function has dups for MFV for approximate results
>> (MADLIB-1412)
>>     - SVM: Change default num_components for SVM to max(100,
>> 2*num_features) (MADLIB-1384)
>>
>> Bug fixes:
>>     - DL: Deep Learning module does not work with tables in non-public
>> schemas (MADLIB-1388)
>>     - DL: Exception during madlib_keras_fit when model_arch_id is passed
>> as NULL (MADLIB-1371)
>>     - DL: fit and fit multiple fail with memory exception in gpdb6
>> (MADLIB-1405)
>>     - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
>>     - DL: Intermediate tables are not dropped  (MADLIB-1404)
>>     - DL: MADlib Keras operations create too many threads (MADLIB-1372)
>>     - DL: metrics_elapsed_time for fit multi_model not captured correctly
>> (MADLIB-1403)
>>     - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
>>     - DL: Remove final function for fit multiple (MADLIB-1416)
>>     - DL: Support schema qualified output tables for fit and fit_multiple
>> (MADLIB-1417)
>>     - Graph: APSP fails if both vertex id column and edge src column has
>> the same name (MADLIB-1407)
>>     - Graph: ASPS Path Function fails if src or dest column type is
>> bigint (MADLIB-1408)
>>     - Graph: Graph/wcc fails if the user specifies a schema for the
>> output table (MADLIB-1411)
>>     - Kmeans: k-means related functions must use same default distance
>> function (MADLIB-1383)
>>     - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
>>     - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
>>     - Pivot:  Pivot documentation should say "out_table" instead of
>> "output_table" (MADLIB-1376)
>>
>> Other:
>>     - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
>>     - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
>> some versions of GPDB 6, the database will keep adding to the disk space
>> (in proportion to model size) and will only release the disk space once the
>> fit multiple query has completed execution. This is not the case for GPDB
>> 6.5.0+ where disk space is released during the fit multiple query.
>>     - DL: CUDA GPU memory cannot be released until the process holding it
>> is terminated.  This process holds the GPU memory until one of the
>> following two things happen: query finishes and user logs out of the
>> Postgres client/session; or, query finishes and user waits for the timeout
>> set by `gp_vmem_idle_resource_timeout`. The default value for this timeout
>> in Greenplum is 18 sec, but it can be changed.
>>     - DL: pg_temp is not allowed as an output table schema for
>> madlib_keras_fit_multiple_model().
>>     - Build: Enable current versions of bison
>>     - Build: Add cmake variable for gppkg filename
>>     - Build: Add pull request template
>>
>> 1.17.0 docs available here:
>> http://madlib.apache.org/docs/rc/index.html
>>
>> For additional information, please see:
>> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0
>>
>> Here are the release artifact details:
>>
>> Source release tag to be voted on: rc/1.17.0-rc2, located here:
>>
>> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2
>>
>> Source release tarball can be retrieved from the following locations:
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512
>>
>> Convenience binary packages can be retrieved from the following
>> locations:
>>
>> macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512
>>
>> CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512
>>
>> CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
>> (compiled with gcc 6.2)
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512
>>
>> Ubuntu 18.04, GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)
>>
>> Package:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
>> PGP Signature:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
>> SHA512 Hash:
>>
>> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512
>>
>> The PGP KEYS file used to validate the signature of the release artifacts
>> is available here:
>> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>>
>> To help in tallying the vote, PMC members please be sure to indicate
>> “(binding)” with the vote.
>>
>> [ ] +1 approve
>> [ ] +0 no opinion
>> [ ] -1 disapprove (and reason why)
>>
>> Best regards,
>> Orhan Kislal <ok...@apache.org>
>>
>
>
> --
> Domino Valdano <dv...@vmware.com>
> Pronouns:  She/Her
> VMware Staff Software Engineer
> Modern Applications Platform
>
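
The announcement quoted above lists a SHA512 hash file next to each
artifact. As a rough illustration of checking a downloaded package against
it, here is a minimal Python sketch. The function names are hypothetical,
and it assumes the .sha512 file uses the sha512sum layout (hex digest as the
first whitespace-separated token); a different layout would need different
parsing. Verifying the PGP signature against the KEYS file is a separate
step and is not covered here.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, checksum_text):
    """Compare a file against the text content of a .sha512 file.

    Assumption: the first whitespace-separated token of `checksum_text`
    is the hex digest (sha512sum-style output).
    """
    tokens = checksum_text.split()
    if not tokens:
        return False
    return sha512_of(path) == tokens[0].lower()
```

For example, matches_checksum("apache-madlib-1.17.0-src.tar.gz",
open("apache-madlib-1.17.0-src.tar.gz.sha512").read()) would return True
only if the two digests agree, given the layout assumption above.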

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Orhan Kislal <ok...@pivotal.io>.
Hello Apache MADlib community,

The vote for releasing Apache MADlib 1.17.0 (RC2) passed with 5 binding
+1s, 2 non-binding +1s, and no +0 or -1 votes.

Below is a summary of the voting:

*Binding (PMC members) +1s (5):*
Frank McQuillan
Nikhil Kak
Xixuan (Aaron) Feng
Nandish Jayaram
Xiaocheng Tang

*Non-binding (non-PMC members) +1s (2):*
Ekta Khanna
Domino Valdano

Official vote thread:
https://lists.apache.org/thread.html/r01f12d62f67a914da4fe8b36ae0fcfc4756e833ef409d94bd0030a27%40%3Cdev.madlib.apache.org%3E
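
The tally above can be checked against the pass rule stated in the vote
announcement (at least 3 binding +1 votes, and more binding +1 than binding
-1). The following is a minimal Python sketch; the function name and the
default threshold parameter are illustrative, not part of any Apache
tooling.

```python
def vote_passes(binding_plus_one, binding_minus_one, min_binding=3):
    """Apply the pass rule from the announcement.

    Only binding votes are counted: the vote passes with at least
    `min_binding` binding +1 votes and more binding +1 than binding -1.
    """
    return (binding_plus_one >= min_binding
            and binding_plus_one > binding_minus_one)


# Binding +1 voters as listed in the summary; there were no binding -1 votes.
binding_plus_one_voters = [
    "Frank McQuillan",
    "Nikhil Kak",
    "Xixuan (Aaron) Feng",
    "Nandish Jayaram",
    "Xiaocheng Tang",
]
print(vote_passes(len(binding_plus_one_voters), 0))  # True
```

The two non-binding +1 votes are deliberately left out of the check, since
the rule counts binding votes only.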

Thanks to all for taking the time to review and vote! We will now
update the necessary links and files to proceed with the release.

Best,

Orhan Kislal

On Wed, Apr 8, 2020 at 9:17 PM Domino Valdano <dv...@pivotal.io> wrote:

> Tested on OSX 10.14.6 (Mojave) with gpdb5 assert build
>
> Installs and passes most of dev-check, aside from a couple known failures
> due to asserts.
>
>  PostgreSQL 8.3.23 (Greenplum Database 5.23.0+dev.10.g50015fed64 build
> dev) on x86_64-apple-darwin18.7.0, compiled by GCC Apple LLVM version
> 10.0.1 (clang-1001.0.46.4), 64-bit compiled on Nov  5 2019 08:49:26 (with
> assert checking)
>
> +1  (non-binding)
>
> Domino
>

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Domino Valdano <dv...@pivotal.io>.
Tested on OSX 10.14.6 (Mojave) with gpdb5 assert build

Installs and passes most of dev-check, aside from a couple known failures
due to asserts.

 PostgreSQL 8.3.23 (Greenplum Database 5.23.0+dev.10.g50015fed64 build dev)
on x86_64-apple-darwin18.7.0, compiled by GCC Apple LLVM version 10.0.1
(clang-1001.0.46.4), 64-bit compiled on Nov  5 2019 08:49:26 (with assert
checking)

+1  (non-binding)

Domino


-- 
Domino Valdano <dv...@vmware.com>
Pronouns:  She/Her
VMware Staff Software Engineer
Modern Applications Platform

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Frank McQuillan <fm...@pivotal.io>.
+1 (binding)

tested on pg11.3 on osx
- passed install check, dev check, spot check of some new 1.17 functions

tested on gp5.18
- passed spot check of some new 1.17 functions

well done!

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Ekta Khanna <ek...@pivotal.io>.
+1

Tested on OSX (10.13.6) with GPDB 5.24.0, GPDB 6.5.0 and PG11.6
- passed install-check, dev-check and unit-test

Tested on Ubuntu 18.04 with GPDB 6.5.0
- passed install-check, dev-check and unit-test

LGTM!

Thanks,
Ekta

On Mon, Apr 6, 2020 at 4:49 PM Orhan Kislal <ok...@pivotal.io> wrote:

> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
> source release tarball and convenience binaries.
>
> We didn't hold a vote for RC1 because we discovered a minor issue before
> sending the vote.
>
> The vote will run for at least 72 hours and will close on Thursday,
> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
> and more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features
>     - DL: Add optional params to madlib_keras_fit_multiple_model
> (MADLIB-1397)
>     - DL: Fit and evaluate changes for asymmetric cluster config
> (MADLIB-1393)
>     - DL: Make param search fit() function work with existing evaluate and
> predict (MADLIB-1387)
>     - DL: ParamSearch: Add utility function for generating model selection
> table (MADLIB-1375)
>     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
>     - DL: Preprocessor should evenly distribute data on an arbitrary
> number of segments (MADLIB-1378)
>     - DL: Preprocessor support for asymmetric segment distribution
> (MADLIB-1392)
>     - DL: Remove model_arch_table column from the output of
> load_model_selection_table (MADLIB-1381)
>     - DL: Support DL predict without training on MADlib (MADLIB-1359)
>     - DL: Transfer learning for multi-model (MADLIB-1389)
>     - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
>     - Kmeans: Select number of centroids in k-means (MADLIB-1380)
>     - PostgreSQL 12 support (MADLIB-1391)
>
> Improvements:
>     - Assoc rules: Add option to set number of posterior in association
> rules (MADLIB-1327)
>     - Correlation: Improve correlation and covariance memory usage with
> large number of groups (MADLIB-1301)
>     - DL: helper function for asymmetric cluster config (MADLIB-1390)
>     - DL: Mini-batch preprocessor for images - performance issue
> (MADLIB-1342)
>     - DL: Modify warm start logic for DL to handle case of missing weight
> (MADLIB-1400)
>     - DL: Param search for multiple models on MPP architecture
> (MADLIB-1386)
>     - DL: performance improvements to fit transition function (MADLIB-1418)
>     - Docs: Enhance Installation Guides (MADLIB-1399)
>     - Graph: SSSP should not show vertices in output table that are
> unreachable (MADLIB-1415)
>     - Knn - add zero check and output distance array (MADLIB-1370)
>     - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
>     - Summary: Last optional param in summary errors when NULL
> (MADLIB-1413)
>     - Summary: Summary function has dups for MFV for approximate results
> (MADLIB-1412)
>     - SVM: Change default num_components for SVM to max(100,
> 2*num_features) (MADLIB-1384)
>
> Bug fixes:
>     - DL: Deep Learning module does not work with tables in non-public
> schemas (MADLIB-1388)
>     - DL: Exception during madlib_keras_fit when model_arch_id is passed
> as NULL (MADLIB-1371)
>     - DL: fit and fit multiple fail with memory exception in gpdb6
> (MADLIB-1405)
>     - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
>     - DL: Intermediate tables are not dropped  (MADLIB-1404)
>     - DL: MADlib Keras operations create too many threads (MADLIB-1372)
>     - DL: metrics_elapsed_time for fit multi_model not captured correctly
> (MADLIB-1403)
>     - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
>     - DL: Remove final function for fit multiple (MADLIB-1416)
>     - DL: Support schema qualified output tables for fit and fit_multiple
> (MADLIB-1417)
>     - Graph: APSP fails if both vertex id column and edge src column have
> the same name (MADLIB-1407)
>     - Graph: APSP Path Function fails if src or dest column type is bigint
> (MADLIB-1408)
>     - Graph: Graph/wcc fails if the user specifies a schema for the output
> table (MADLIB-1411)
>     - Kmeans: k-means related functions must use same default distance
> function (MADLIB-1383)
>     - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
>     - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
>     - Pivot:  Pivot documentation should say "out_table" instead of
> "output_table" (MADLIB-1376)
>
> Other:
>     - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
>     - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
> some versions of GPDB 6, the database will keep adding to the disk space
> (in proportion to model size) and will only release the disk space once the
> fit multiple query has completed execution. This is not the case for GPDB
> 6.5.0+ where disk space is released during the fit multiple query.
>     - DL: CUDA GPU memory cannot be released until the process holding it
> is terminated.  This process holds the GPU memory until one of the
> following two things happens: query finishes and user logs out of the
> Postgres client/session; or, query finishes and user waits for the timeout
> set by `gp_vmem_idle_resource_timeout`. The default value for this timeout
> in Greenplum is 18 sec, but it can be changed.
>     - DL: pg_temp is not allowed as an output table schema for
> madlib_keras_fit_multiple_model().
>     - Build: Enable current versions of bison
>     - Build: Add cmake variable for gppkg filename
>     - Build: Add pull request template
>
> 1.17.0 docs available here:
> http://madlib.apache.org/docs/rc/index.html
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.17.0-rc2, located here:
>
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2
>
> Source release tarball can be retrieved from the following locations:
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512
>
> CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
> (compiled with gcc 6.2)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512
>
> Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512
>
> The PGP KEYS file used to validate the signature of the release artifacts
> is available here:
> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>
> To help in tallying the vote, PMC members please be sure to indicate
> “(binding)” with the vote.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Best regards,
> Orhan Kislal <ok...@apache.org>
>

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Frank McQuillan <fm...@pivotal.io>.
+1 (binding)

tested on pg11.3 on osx
- passed install check, dev check, spot check of some new 1.17 functions

tested on gp5.18
- passed spot check of some new 1.17 functions

well done!

On Mon, Apr 6, 2020 at 4:49 PM Orhan Kislal <ok...@pivotal.io> wrote:

> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
> source release tarball and convenience binaries.
>
> We didn't hold a vote for RC1 because we discovered a minor issue before
> sending the vote.
>
> The vote will run for at least 72 hours and will close on Thursday,
> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
> and more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features
>     - DL: Add optional params to madlib_keras_fit_multiple_model
> (MADLIB-1397)
>     - DL: Fit and evaluate changes for asymmetric cluster config
> (MADLIB-1393)
>     - DL: Make param search fit() function work with existing evaluate and
> predict (MADLIB-1387)
>     - DL: ParamSearch: Add utility function for generating model selection
> table (MADLIB-1375)
>     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
>     - DL: Preprocessor should evenly distribute data on an arbitrary number
> of segments (MADLIB-1378)
>     - DL: Preprocessor support for asymmetric segment distribution
> (MADLIB-1392)
>     - DL: Remove model_arch_table column from the output of
> load_model_selection_table (MADLIB-1381)
>     - DL: Support DL predict without training on MADlib (MADLIB-1359)
>     - DL: Transfer learning for multi-model (MADLIB-1389)
>     - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
>     - Kmeans: Select number of centroids in k-means (MADLIB-1380)
>     - PostgreSQL 12 support (MADLIB-1391)
>
> Improvements:
>     - Assoc rules: Add option to set number of posterior in association
> rules (MADLIB-1327)
>     - Correlation: Improve correlation and covariance memory usage with
> large number of groups (MADLIB-1301)
>     - DL: helper function for asymmetric cluster config (MADLIB-1390)
>     - DL: Mini-batch preprocessor for images - performance issue
> (MADLIB-1342)
>     - DL: Modify warm start logic for DL to handle case of missing weight
> (MADLIB-1400)
>     - DL: Param search for multiple models on MPP architecture
> (MADLIB-1386)
>     - DL: performance improvements to fit transition function (MADLIB-1418)
>     - Docs: Enhance Installation Guides (MADLIB-1399)
>     - Graph: SSSP should not show vertices in output table that are
> unreachable (MADLIB-1415)
>     - Knn - add zero check and output distance array (MADLIB-1370)
>     - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
>     - Summary: Last optional param in summary errors when NULL
> (MADLIB-1413)
>     - Summary: Summary function has dups for MFV for approximate results
> (MADLIB-1412)
>     - SVM: Change default num_components for SVM to max(100,
> 2*num_features) (MADLIB-1384)
>
> Bug fixes:
>     - DL: Deep Learning module does not work with tables in non-public
> schemas (MADLIB-1388)
>     - DL: Exception during madlib_keras_fit when model_arch_id is passed as
> NULL (MADLIB-1371)
>     - DL: fit and fit multiple fail with memory exception in gpdb6
> (MADLIB-1405)
>     - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
>     - DL: Intermediate tables are not dropped  (MADLIB-1404)
>     - DL: MADlib Keras operations create too many threads (MADLIB-1372)
>     - DL: metrics_elapsed_time for fit multi_model not captured correctly
> (MADLIB-1403)
>     - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
>     - DL: Remove final function for fit multiple (MADLIB-1416)
>     - DL: Support schema qualified output tables for fit and fit_multiple
> (MADLIB-1417)
>     - Graph: APSP fails if both vertex id column and edge src column have
> the same name (MADLIB-1407)
>     - Graph: APSP Path Function fails if src or dest column type is bigint
> (MADLIB-1408)
>     - Graph: Graph/wcc fails if the user specifies a schema for the output
> table (MADLIB-1411)
>     - Kmeans: k-means related functions must use same default distance
> function (MADLIB-1383)
>     - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
>     - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
>     - Pivot:  Pivot documentation should say "out_table" instead of
> "output_table" (MADLIB-1376)
>
> Other:
>     - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
>     - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
> some versions of GPDB 6, the database will keep adding to the disk space
> (in proportion to model size) and will only release the disk space once the
> fit multiple query has completed execution. This is not the case for GPDB
> 6.5.0+ where disk space is released during the fit multiple query.
>     - DL: CUDA GPU memory cannot be released until the process holding it
> is terminated.  This process holds the GPU memory until one of the
> following two things happens: query finishes and user logs out of the
> Postgres client/session; or, query finishes and user waits for the timeout
> set by `gp_vmem_idle_resource_timeout`. The default value for this timeout
> in Greenplum is 18 sec, but it can be changed.
>     - DL: pg_temp is not allowed as an output table schema for
> madlib_keras_fit_multiple_model().
>     - Build: Enable current versions of bison
>     - Build: Add cmake variable for gppkg filename
>     - Build: Add pull request template
>
> 1.17.0 docs available here:
> http://madlib.apache.org/docs/rc/index.html
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.17.0-rc2, located here:
>
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2
>
> Source release tarball can be retrieved from the following locations:
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512
>
> CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
> (compiled with gcc 6.2)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512
>
> Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512
>
> The PGP KEYS file used to validate the signature of the release artifacts
> is available here:
> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>
> To help in tallying the vote, PMC members please be sure to indicate
> “(binding)” with the vote.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Best regards,
> Orhan Kislal <ok...@apache.org>
>

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Xiaocheng Tang <xi...@gmail.com>.
+1

Xiaocheng

________________________________
From: Orhan Kislal <ok...@pivotal.io>
Sent: Monday, April 6, 2020 4:48:58 PM
To: dev@madlib.apache.org <de...@madlib.apache.org>; user@madlib.apache.org <us...@madlib.apache.org>
Subject: [VOTE] MADlib v1.17.0-rc2

Hello Apache MADlib community,

This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
source release tarball and convenience binaries.

We didn't hold a vote for RC1 because we discovered a minor issue before
sending the vote.

The vote will run for at least 72 hours and will close on Thursday,
April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
and more binding +1 than binding -1 are required to pass.

The main goals of this release are:

New features
    - DL: Add optional params to madlib_keras_fit_multiple_model (MADLIB-1397)
    - DL: Fit and evaluate changes for asymmetric cluster config (MADLIB-1393)
    - DL: Make param search fit() function work with existing evaluate and predict (MADLIB-1387)
    - DL: ParamSearch: Add utility function for generating model selection table (MADLIB-1375)
    - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
    - DL: Preprocessor should evenly distribute data on an arbitrary number of segments (MADLIB-1378)
    - DL: Preprocessor support for asymmetric segment distribution (MADLIB-1392)
    - DL: Remove model_arch_table column from the output of load_model_selection_table (MADLIB-1381)
    - DL: Support DL predict without training on MADlib (MADLIB-1359)
    - DL: Transfer learning for multi-model (MADLIB-1389)
    - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
    - Kmeans: Select number of centroids in k-means (MADLIB-1380)
    - PostgreSQL 12 support (MADLIB-1391)

Improvements:
    - Assoc rules: Add option to set number of posterior in association rules (MADLIB-1327)
    - Correlation: Improve correlation and covariance memory usage with large number of groups (MADLIB-1301)
    - DL: helper function for asymmetric cluster config (MADLIB-1390)
    - DL: Mini-batch preprocessor for images - performance issue (MADLIB-1342)
    - DL: Modify warm start logic for DL to handle case of missing weight (MADLIB-1400)
    - DL: Param search for multiple models on MPP architecture (MADLIB-1386)
    - DL: performance improvements to fit transition function (MADLIB-1418)
    - Docs: Enhance Installation Guides (MADLIB-1399)
    - Graph: SSSP should not show vertices in output table that are unreachable (MADLIB-1415)
    - Knn - add zero check and output distance array (MADLIB-1370)
    - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
    - Summary: Last optional param in summary errors when NULL (MADLIB-1413)
    - Summary: Summary function has dups for MFV for approximate results (MADLIB-1412)
    - SVM: Change default num_components for SVM to max(100, 2*num_features) (MADLIB-1384)

Bug fixes:
    - DL: Deep Learning module does not work with tables in non-public schemas (MADLIB-1388)
    - DL: Exception during madlib_keras_fit when model_arch_id is passed as NULL (MADLIB-1371)
    - DL: fit and fit multiple fail with memory exception in gpdb6 (MADLIB-1405)
    - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
    - DL: Intermediate tables are not dropped  (MADLIB-1404)
    - DL: MADlib Keras operations create too many threads (MADLIB-1372)
    - DL: metrics_elapsed_time for fit multi_model not captured correctly (MADLIB-1403)
    - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
    - DL: Remove final function for fit multiple (MADLIB-1416)
    - DL: Support schema qualified output tables for fit and fit_multiple (MADLIB-1417)
    - Graph: APSP fails if both vertex id column and edge src column have the same name (MADLIB-1407)
    - Graph: APSP Path Function fails if src or dest column type is bigint (MADLIB-1408)
    - Graph: Graph/wcc fails if the user specifies a schema for the output table (MADLIB-1411)
    - Kmeans: k-means related functions must use same default distance function (MADLIB-1383)
    - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
    - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
    - Pivot:  Pivot documentation should say "out_table" instead of "output_table" (MADLIB-1376)

Other:
    - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
    - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and some versions of GPDB 6, the database will keep adding to the disk space (in proportion to model size) and will only release the disk space once the fit multiple query has completed execution. This is not the case for GPDB 6.5.0+ where disk space is released during the fit multiple query.
    - DL: CUDA GPU memory cannot be released until the process holding it is terminated.  This process holds the GPU memory until one of the following two things happens: query finishes and user logs out of the Postgres client/session; or, query finishes and user waits for the timeout set by `gp_vmem_idle_resource_timeout`. The default value for this timeout in Greenplum is 18 sec, but it can be changed.
    - DL: pg_temp is not allowed as an output table schema for madlib_keras_fit_multiple_model().
    - Build: Enable current versions of bison
    - Build: Add cmake variable for gppkg filename
    - Build: Add pull request template

1.17.0 docs available here:
http://madlib.apache.org/docs/rc/index.html

For additional information, please see:
https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0

Here are the release artifact details:

Source release tag to be voted on: rc/1.17.0-rc2, located here:
https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2

Source release tarball can be retrieved from the following locations:
Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512

Convenience binary packages can be retrieved from the following
locations:

macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512

CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512

CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 6.2)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512

Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512

The PGP KEYS file used to validate the signature of the release artifacts
is available here:
https://dist.apache.org/repos/dist/dev/madlib/KEYS
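
As an illustration of checking a downloaded artifact against its published SHA512 hash, here is a minimal Python sketch. It is not part of the release tooling. The helper names `sha512_of` and `matches_published_hash` are hypothetical, and the sketch assumes only that the published hash file contains the 128 hex digits of the digest, possibly split into groups or accompanied by a file name. It does not verify the PGP signature, which needs a separate check against the KEYS file.

```python
import hashlib
import re


def sha512_of(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-512 digest of the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(artifact_path, published_text):
    """Check a local file against the text of a published hash file.

    Assumption: the published text holds the full hex digest, which may be
    broken up by spaces or newlines and may sit next to a file name.
    """
    # Remove all whitespace so a digest printed in groups becomes contiguous.
    compact = re.sub(r"\s+", "", published_text).lower()
    return sha512_of(artifact_path) in compact
```

A caller would download both the package and its `.sha512` file, read the latter as text, and pass the two to `matches_published_hash`; a `False` result suggests a corrupted or mismatched download.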

To help in tallying the vote, PMC members please be sure to indicate
“(binding)” with the vote.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Best regards,
Orhan Kislal <ok...@apache.org>

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Domino Valdano <dv...@pivotal.io>.
Tested on OSX 10.14.6 (Mojave) with gpdb5 assert build

Installs and passes most of dev-check, aside from a couple known failures
due to asserts.

 PostgreSQL 8.3.23 (Greenplum Database 5.23.0+dev.10.g50015fed64 build dev)
on x86_64-apple-darwin18.7.0, compiled by GCC Apple LLVM version 10.0.1
(clang-1001.0.46.4), 64-bit compiled on Nov  5 2019 08:49:26 (with assert
checking)

+1  (non-binding)

Domino

On Mon, Apr 6, 2020 at 4:49 PM Orhan Kislal <ok...@pivotal.io> wrote:

> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
> source release tarball and convenience binaries.
>
> We didn't hold a vote for RC1 because we discovered a minor issue before
> sending the vote.
>
> The vote will run for at least 72 hours and will close on Thursday,
> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
> and more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features
>     - DL: Add optional params to madlib_keras_fit_multiple_model
> (MADLIB-1397)
>     - DL: Fit and evaluate changes for asymmetric cluster config
> (MADLIB-1393)
>     - DL: Make param search fit() function work with existing evaluate and
> predict (MADLIB-1387)
>     - DL: ParamSearch: Add utility function for generating model selection
> table (MADLIB-1375)
>     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
>     - DL: Preprocessor should evenly distribute data on an arbitrary
> number of segments (MADLIB-1378)
>     - DL: Preprocessor support for asymmetric segment distribution
> (MADLIB-1392)
>     - DL: Remove model_arch_table column from the output of
> load_model_selection_table (MADLIB-1381)
>     - DL: Support DL predict without training on MADlib (MADLIB-1359)
>     - DL: Transfer learning for multi-model (MADLIB-1389)
>     - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
>     - Kmeans: Select number of centroids in k-means (MADLIB-1380)
>     - PostgreSQL 12 support (MADLIB-1391)
>
> Improvements:
>     - Assoc rules: Add option to set number of posterior in association
> rules (MADLIB-1327)
>     - Correlation: Improve correlation and covariance memory usage with
> large number of groups (MADLIB-1301)
>     - DL: helper function for asymmetric cluster config (MADLIB-1390)
>     - DL: Mini-batch preprocessor for images - performance issue
> (MADLIB-1342)
>     - DL: Modify warm start logic for DL to handle case of missing weight
> (MADLIB-1400)
>     - DL: Param search for multiple models on MPP architecture
> (MADLIB-1386)
>     - DL: performance improvements to fit transition function (MADLIB-1418)
>     - Docs: Enhance Installation Guides (MADLIB-1399)
>     - Graph: SSSP should not show vertices in output table that are
> unreachable (MADLIB-1415)
>     - Knn - add zero check and output distance array (MADLIB-1370)
>     - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
>     - Summary: Last optional param in summary errors when NULL
> (MADLIB-1413)
>     - Summary: Summary function has dups for MFV for approximate results
> (MADLIB-1412)
>     - SVM: Change default num_components for SVM to max(100,
> 2*num_features) (MADLIB-1384)
>
> Bug fixes:
>     - DL: Deep Learning module does not work with tables in non-public
> schemas (MADLIB-1388)
>     - DL: Exception during madlib_keras_fit when model_arch_id is passed
> as NULL (MADLIB-1371)
>     - DL: fit and fit multiple fail with memory exception in gpdb6
> (MADLIB-1405)
>     - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
>     - DL: Intermediate tables are not dropped  (MADLIB-1404)
>     - DL: MADlib Keras operations create too many threads (MADLIB-1372)
>     - DL: metrics_elapsed_time for fit multi_model not captured correctly
> (MADLIB-1403)
>     - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
>     - DL: Remove final function for fit multiple (MADLIB-1416)
>     - DL: Support schema qualified output tables for fit and fit_multiple
> (MADLIB-1417)
>     - Graph: APSP fails if both vertex id column and edge src column have
> the same name (MADLIB-1407)
>     - Graph: APSP Path Function fails if src or dest column type is bigint
> (MADLIB-1408)
>     - Graph: Graph/wcc fails if the user specifies a schema for the output
> table (MADLIB-1411)
>     - Kmeans: k-means related functions must use same default distance
> function (MADLIB-1383)
>     - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
>     - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
>     - Pivot:  Pivot documentation should say "out_table" instead of
> "output_table" (MADLIB-1376)
>
> Other:
>     - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
>     - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
> some versions of GPDB 6, the database will keep adding to the disk space
> (in proportion to model size) and will only release the disk space once the
> fit multiple query has completed execution. This is not the case for GPDB
> 6.5.0+ where disk space is released during the fit multiple query.
>     - DL: CUDA GPU memory cannot be released until the process holding it
> is terminated.  This process holds the GPU memory until one of the
> following two things happens: query finishes and user logs out of the
> Postgres client/session; or, query finishes and user waits for the timeout
> set by `gp_vmem_idle_resource_timeout`. The default value for this timeout
> in Greenplum is 18 sec, but it can be changed.
>     - DL: pg_temp is not allowed as an output table schema for
> madlib_keras_fit_multiple_model().
>     - Build: Enable current versions of bison
>     - Build: Add cmake variable for gppkg filename
>     - Build: Add pull request template
>
> 1.17.0 docs available here:
> http://madlib.apache.org/docs/rc/index.html
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.17.0-rc2, located here:
>
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2
>
> Source release tarball can be retrieved from the following locations:
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512
>
> CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
> (compiled with gcc 6.2)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512
>
> Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512
>
> The PGP KEYS file used to validate the signature of the release artifacts
> is available here:
> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>
> To help in tallying the vote, PMC members please be sure to indicate
> “(binding)” with the vote.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Best regards,
> Orhan Kislal <ok...@apache.org>
>


-- 
Domino Valdano <dv...@vmware.com>
Pronouns:  She/Her
VMware Staff Software Engineer
Modern Applications Platform

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Xiaocheng Tang <xi...@gmail.com>.
+1

Xiaocheng

________________________________
From: Orhan Kislal <ok...@pivotal.io>
Sent: Monday, April 6, 2020 4:48:58 PM
To: dev@madlib.apache.org <de...@madlib.apache.org>; user@madlib.apache.org <us...@madlib.apache.org>
Subject: [VOTE] MADlib v1.17.0-rc2

Hello Apache MADlib community,

This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
source release tarball and convenience binaries.

We didn't hold a vote for RC1 because we discovered a minor issue before
sending the vote.

The vote will run for at least 72 hours and will close on Thursday,
April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
and more binding +1 than binding -1 are required to pass.
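
The passing rule stated above can be sketched in a few lines of Python. This is only an illustration of the rule as worded in this message; the function name `vote_passes` and the integer encoding of votes (+1, 0, -1) are assumptions, and the caller is expected to pass in binding votes only.

```python
def vote_passes(binding_votes):
    """Apply the rule from the vote email to a list of binding votes.

    Each vote is encoded as +1, 0, or -1 (hypothetical encoding).
    The vote passes with at least 3 binding +1 votes and more
    binding +1 votes than binding -1 votes.
    """
    plus_ones = sum(1 for vote in binding_votes if vote == 1)
    minus_ones = sum(1 for vote in binding_votes if vote == -1)
    return plus_ones >= 3 and plus_ones > minus_ones


if __name__ == "__main__":
    # Three binding +1 votes and one binding -1 vote.
    print(vote_passes([1, 1, 1, -1]))
```

Non-binding votes are not counted by this sketch, so they should be filtered out before calling it.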

The main goals of this release are:

New features
    - DL: Add optional params to madlib_keras_fit_multiple_model (MADLIB-1397)
    - DL: Fit and evaluate changes for asymmetric cluster config (MADLIB-1393)
    - DL: Make param search fit() function work with existing evaluate and predict (MADLIB-1387)
    - DL: ParamSearch: Add utility function for generating model selection table (MADLIB-1375)
    - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
    - DL: Preprocessor should evenly distribute data on an arbitrary number of segments (MADLIB-1378)
    - DL: Preprocessor support for asymmetric segment distribution (MADLIB-1392)
    - DL: Remove model_arch_table column from the output of load_model_selection_table (MADLIB-1381)
    - DL: Support DL predict without training on MADlib (MADLIB-1359)
    - DL: Transfer learning for multi-model (MADLIB-1389)
    - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
    - Kmeans: Select number of centroids in k-means (MADLIB-1380)
    - PostgreSQL 12 support (MADLIB-1391)

Improvements:
    - Assoc rules: Add option to set number of posterior in association rules (MADLIB-1327)
    - Correlation: Improve correlation and covariance memory usage with large number of groups (MADLIB-1301)
    - DL: helper function for asymmetric cluster config (MADLIB-1390)
    - DL: Mini-batch preprocessor for images - performance issue (MADLIB-1342)
    - DL: Modify warm start logic for DL to handle case of missing weight (MADLIB-1400)
    - DL: Param search for multiple models on MPP architecture (MADLIB-1386)
    - DL: performance improvements to fit transition function (MADLIB-1418)
    - Docs: Enhance Installation Guides (MADLIB-1399)
    - Graph: SSSP should not show vertices in output table that are unreachable (MADLIB-1415)
    - Knn - add zero check and output distance array (MADLIB-1370)
    - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
    - Summary: Last optional param in summary errors when NULL (MADLIB-1413)
    - Summary: Summary function has dups for MFV for approximate results (MADLIB-1412)
    - SVM: Change default num_components for SVM to max(100, 2*num_features) (MADLIB-1384)

Bug fixes:
    - DL: Deep Learning module does not work with tables in non-public schemas (MADLIB-1388)
    - DL: Exception during madlib_keras_fit when model_arch_id is passed as NULL (MADLIB-1371)
    - DL: fit and fit multiple fail with memory exception in gpdb6 (MADLIB-1405)
    - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
    - DL: Intermediate tables are not dropped (MADLIB-1404)
    - DL: MADlib Keras operations create too many threads (MADLIB-1372)
    - DL: metrics_elapsed_time for fit multi_model not captured correctly (MADLIB-1403)
    - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
    - DL: Remove final function for fit multiple (MADLIB-1416)
    - DL: Support schema qualified output tables for fit and fit_multiple (MADLIB-1417)
    - Graph: APSP fails if both vertex id column and edge src column has the same name (MADLIB-1407)
    - Graph: APSP Path Function fails if src or dest column type is bigint (MADLIB-1408)
    - Graph: Graph/wcc fails if the user specifies a schema for the output table (MADLIB-1411)
    - Kmeans: k-means related functions must use same default distance function (MADLIB-1383)
    - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
    - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
    - Pivot: Pivot documentation should say "out_table" instead of "output_table" (MADLIB-1376)

Other:
    - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
    - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and some versions of GPDB 6, the database will keep adding to the disk space (in proportion to model size) and will only release the disk space once the fit multiple query has completed execution. This is not the case for GPDB 6.5.0+ where disk space is released during the fit multiple query.
    - DL: CUDA GPU memory cannot be released until the process holding it is terminated. This process holds the GPU memory until one of the following two things happens: query finishes and user logs out of the Postgres client/session; or, query finishes and user waits for the timeout set by `gp_vmem_idle_resource_timeout`. The default value for this timeout in Greenplum is 18 sec, but it can be changed.
    - DL: pg_temp is not allowed as an output table schema for madlib_keras_fit_multiple_model().
    - Build: Enable current versions of bison
    - Build: Add cmake variable for gppkg filename
    - Build: Add pull request template

1.17.0 docs available here:
http://madlib.apache.org/docs/rc/index.html

For additional information, please see:
https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0

Here are the release artifact details:

Source release tag to be voted on: rc/1.17.0-rc2, located here:
https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2

Source release tarball can be retrieved from the following locations:
Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512

Convenience binary packages can be retrieved from the following
locations:

macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512

CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512

CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 6.2)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512

Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)

Package:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
PGP Signature:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
SHA512 Hash:
https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512

The PGP KEYS file used to validate the signature of the release artifacts
is available here:
https://dist.apache.org/repos/dist/dev/madlib/KEYS
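
For reviewers checking the artifacts, a minimal Python sketch for comparing a downloaded file against its SHA512 hash file follows. It assumes the `.sha512` file holds either one contiguous 128-digit hex digest (as `sha512sum` writes) or a digest split into whitespace-separated groups after a `name:` prefix (as `gpg --print-md` writes); the actual layout of the files above is not stated in this message, and the helper names are hypothetical. It does not check the PGP signature, which still needs the KEYS file and a tool such as gpg.

```python
import hashlib
import re
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_from_text(text):
    """Extract a SHA-512 hex digest from checksum-file text.

    Assumed layouts: a contiguous 128-digit hex string, or hex groups
    separated by whitespace following a "name:" prefix.
    """
    match = re.search(r"\b[0-9a-fA-F]{128}\b", text)
    if match:
        return match.group(0).lower()
    # Fall back to the grouped layout: drop the prefix, then all whitespace.
    tail = text.split(":", 1)[-1]
    compact = re.sub(r"\s+", "", tail)
    if re.fullmatch(r"[0-9a-fA-F]{128}", compact):
        return compact.lower()
    raise ValueError("no SHA-512 digest found in checksum text")


def verify(artifact_path, checksum_path):
    """Return True if the artifact's digest matches the checksum file."""
    expected = digest_from_text(Path(checksum_path).read_text())
    return sha512_of(artifact_path) == expected
```

With the source tarball and its hash file downloaded into the current directory, `verify("apache-madlib-1.17.0-src.tar.gz", "apache-madlib-1.17.0-src.tar.gz.sha512")` would report whether the two agree.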

To help in tallying the vote, PMC members please be sure to indicate
“(binding)” with the vote.

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)

Best regards,
Orhan Kislal <ok...@apache.org>

Re: [VOTE] MADlib v1.17.0-rc2

Posted by Nandish Jayaram <j....@gmail.com>.
[+1] binding.

NJ

>
>
> ---------- Forwarded message ---------
> From: Orhan Kislal <ok...@pivotal.io>
> Date: Mon, Apr 6, 2020 at 4:49 PM
> Subject: [VOTE] MADlib v1.17.0-rc2
> To: <de...@madlib.apache.org>, <us...@madlib.apache.org>
>
>
> Hello Apache MADlib community,
>
> This is the vote for Apache MADlib 1.17.0 Release (RC2). It provides the
> source release tarball and convenience binaries.
>
> We didn't hold a vote for RC1 because we discovered a minor issue before
> sending the vote.
>
> The vote will run for at least 72 hours and will close on Thursday,
> April 9, 2020 @ 23:59 UTC (16:59 PDT). A minimum of 3 binding +1 votes
> and more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features
>     - DL: Add optional params to madlib_keras_fit_multiple_model
> (MADLIB-1397)
>     - DL: Fit and evaluate changes for asymmetric cluster config
> (MADLIB-1393)
>     - DL: Make param search fit() function work with existing evaluate and
> predict (MADLIB-1387)
>     - DL: ParamSearch: Add utility function for generating model selection
> table (MADLIB-1375)
>     - DL: Predict changes for asymmetric cluster config (MADLIB-1394)
>     - DL: Preprocessor should evenly distribute data on an arbitrary number
> of segments (MADLIB-1378)
>     - DL: Preprocessor support for asymmetric segment distribution
> (MADLIB-1392)
>     - DL: Remove model_arch_table column from the output of
> load_model_selection_table (MADLIB-1381)
>     - DL: Support DL predict without training on MADlib (MADLIB-1359)
>     - DL: Transfer learning for multi-model (MADLIB-1389)
>     - Kmeans: Add simple silhouette score for every point (MADLIB-1382)
>     - Kmeans: Select number of centroids in k-means (MADLIB-1380)
>     - PostgreSQL 12 support (MADLIB-1391)
>
> Improvements:
>     - Assoc rules: Add option to set number of posterior in association
> rules (MADLIB-1327)
>     - Correlation: Improve correlation and covariance memory usage with
> large number of groups (MADLIB-1301)
>     - DL: helper function for asymmetric cluster config (MADLIB-1390)
>     - DL: Mini-batch preprocessor for images - performance issue
> (MADLIB-1342)
>     - DL: Modify warm start logic for DL to handle case of missing weight
> (MADLIB-1400)
>     - DL: Param search for multiple models on MPP architecture
> (MADLIB-1386)
>     - DL: performance improvements to fit transition function (MADLIB-1418)
>     - Docs: Enhance Installation Guides (MADLIB-1399)
>     - Graph: SSSP should not show vertices in output table that are
> unreachable (MADLIB-1415)
>     - Knn - add zero check and output distance array (MADLIB-1370)
>     - LDA: Add stopping criteria on perplexity to LDA (MADLIB-1351)
>     - Summary: Last optional param in summary errors when NULL
> (MADLIB-1413)
>     - Summary: Summary function has dups for MFV for approximate results
> (MADLIB-1412)
>     - SVM: Change default num_components for SVM to max(100,
> 2*num_features) (MADLIB-1384)
>
> Bug fixes:
>     - DL: Deep Learning module does not work with tables in non-public
> schemas (MADLIB-1388)
>     - DL: Exception during madlib_keras_fit when model_arch_id is passed as
> NULL (MADLIB-1371)
>     - DL: fit and fit multiple fail with memory exception in gpdb6
> (MADLIB-1405)
>     - DL: fit multiple takes up unnecessary disk space (MADLIB-1406)
>     - DL: Intermediate tables are not dropped (MADLIB-1404)
>     - DL: MADlib Keras operations create too many threads (MADLIB-1372)
>     - DL: metrics_elapsed_time for fit multi_model not captured correctly
> (MADLIB-1403)
>     - DL: predict fails with OOM in gpdb6 (MADLIB-1414)
>     - DL: Remove final function for fit multiple (MADLIB-1416)
>     - DL: Support schema qualified output tables for fit and fit_multiple
> (MADLIB-1417)
>     - Graph: APSP fails if both vertex id column and edge src column has
> the same name (MADLIB-1407)
>     - Graph: APSP Path Function fails if src or dest column type is bigint
> (MADLIB-1408)
>     - Graph: Graph/wcc fails if the user specifies a schema for the output
> table (MADLIB-1411)
>     - Kmeans: k-means related functions must use same default distance
> function (MADLIB-1383)
>     - LDA: Term frequency and LDA - turn off notices (MADLIB-1395)
>     - MADlib cannot be built on PowerPC machines with Linux (MADLIB-1410)
>     - Pivot: Pivot documentation should say "out_table" instead of
> "output_table" (MADLIB-1376)
>
> Other:
>     - DL: Support up to Keras version 2.2.4, Tensorflow version 1.14
>     - DL: If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and
> some versions of GPDB 6, the database will keep adding to the disk space
> (in proportion to model size) and will only release the disk space once the
> fit multiple query has completed execution. This is not the case for GPDB
> 6.5.0+ where disk space is released during the fit multiple query.
>     - DL: CUDA GPU memory cannot be released until the process holding it
> is terminated.  This process holds the GPU memory until one of the
> following two things happens: query finishes and user logs out of the
> Postgres client/session; or, query finishes and user waits for the timeout
> set by `gp_vmem_idle_resource_timeout`. The default value for this timeout
> in Greenplum is 18 sec, but it can be changed.
>     - DL: pg_temp is not allowed as an output table schema for
> madlib_keras_fit_multiple_model().
>     - Build: Enable current versions of bison
>     - Build: Add cmake variable for gppkg filename
>     - Build: Add pull request template
>
> 1.17.0 docs available here:
> http://madlib.apache.org/docs/rc/index.html
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.17.0
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.17.0-rc2, located here:
>
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.17.0-rc2
>
> Source release tarball can be retrieved from the following locations:
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.14 GPDB 5.* & 6.*, PostgreSQL 11 & 12
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Darwin.dmg.sha512
>
> CentOS 5.* GPDB 4.3.5+ (compiled with gcc 4.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 (tested on CentOS 7 as well), GPDB 5.* & 6.*, PostgreSQL 11 & 12
> (compiled with gcc 6.2)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.rpm.sha512
>
> Ubuntu 18.04 GPDB 5.* & 6.*, PostgreSQL 11 & 12 (compiled with gcc 7.4)
>
> Package:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb
> PGP Signature:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.asc
> SHA512 Hash:
>
> https://dist.apache.org/repos/dist/dev/madlib/1.17.0.RC2/apache-madlib-1.17.0-bin-Linux.deb.sha512
>
> The PGP KEYS file used to validate the signature of the release artifacts
> is available here:
> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>
> To help in tallying the vote, PMC members please be sure to indicate
> “(binding)” with the vote.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Best regards,
> Orhan Kislal <ok...@apache.org>
>
