Posted to user@madlib.apache.org by Frank McQuillan <fm...@pivotal.io> on 2017/01/04 19:23:52 UTC

DRAFT Apache MADlib (incubating) podling report for Q4 2016

Here is the draft report for Jan 2017, covering Q4 activity.

It is posted at http://wiki.apache.org/incubator/January2017

Please let me know if you have any comments or suggestions and I will
update the report.

-------------------------------

MADlib

Big Data Machine Learning in SQL for Data Scientists.

MADlib has been incubating since 2015-09-15.

Three most important issues to address in the move towards graduation:

  1. Need guidance from the Incubator PMC on how to resolve the switchover
from BSD licensing to the Apache License.  What should the license headers
contain for files that were previously BSD licensed and then granted to
the ASF?  Related legal-discuss threads:
http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201609.mbox/%3CCALGG8z03zHhbFegXoi4fH+vXtF+9m7x6hak9RjKQjapuzi67gQ@mail.gmail.com%3E
http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201603.mbox/%3C9D1AF43C-370B-4E58-B0EF-2E29D242F50B%40jaguNET.com%3E
  2. Continue to produce regular Apache (incubating) releases.
  3. Continue to execute and manage the project according to the governance
model of the "Apache Way".

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

  1. Yes, please see issue #1 above and provide guidance.
  2. The next release, v1.10, will be the 4th as an incubating project.  After
that, the community would ideally like to move towards top-level project
status.

How has the community developed since the last report?

  1. Some related events in Q4 2016 and upcoming:
     * Feb 4, 2017 - Presentation accepted at FOSDEM’17 Graph devroom.
       Topic:  Graph Analytics on Massively Parallel Processing Databases
       (Frank McQuillan)
     * Dec 1, 2016 - MADlib community call.  Topic:  New features in R
       interface and MADlib user survey results (hosted by Greg Chase,
       Orhan Kislal, Frank McQuillan)
     * Nov 16, 2016 - Presentation at PGConf Silicon Valley.  Topic:
       Distributed In-Database Machine Learning with Apache MADlib
       (incubating) (Frank McQuillan)
     * Nov 14, 2016 - Presentation at Apache Big Data Europe.  Topic:
       Distributed In-Database Machine Learning with Apache MADlib
       (incubating) (Roman Shaposhnik)
  2. Material technical conversations on user/dev mailing lists and in the
appropriate JIRAs and pull requests.
  3. New contributors to the project have been working on the KNN module
and the Python interface.

How has the project developed since the last report?

  1. Active work is in progress on the 4th ASF release, MADlib v1.10,
scheduled for Jan 2017.  Features include: a single-source shortest path
graph algorithm, a completely new module for encoding categorical
variables, an R interface update, grouping support in elastic net and PCA,
cross validation in elastic net, and a verbose output option for decision
tree visualization.
  2. Mailing list activity in Q4:  227 postings to dev, 66 postings to user.

Date of last release:

  MADlib v1.9.1 on 9/19/16.

When were the last committers or PMC members elected:

  Orhan Kislal on 9/7/16 and Nandish Jayaram on 9/7/16.