Posted to dev@madlib.apache.org by Frank McQuillan <fm...@pivotal.io> on 2016/11/07 19:55:41 UTC

Apache MADlib user survey results

We recently ran a survey asking MADlib users about a wide range of topics
pertaining to this open source project, including desired new features.
Thank you to all who responded.

You are welcome to view the survey results:
http://madlib.incubator.apache.org/community-artifacts/Apache-MADlib-user-survey-results-Oct-2016.pdf
and make any comments or suggestions.

Quick summary:

* Received ~40 responses from 27 different companies
* ~50% of respondents have used MADlib for 1 year or less
* Fraud detection is the most common use case
* Regression (various), clustering and random forest are the most commonly
used MADlib algorithms
* Gradient boosting is the most commonly requested new algorithm
* Users prefer new algorithms over improvements to existing algorithms
by a 2:1 margin
* Improved documentation/examples and better performance are the biggest
concerns
* The most common other tools used by respondents are R, Spark and Python
(and associated libraries)

Frank