Posted to user@hive.apache.org by Makoto Yui <yu...@gmail.com> on 2014/09/10 09:48:18 UTC

[ANN] Hivemall (a scalable machine learning library for Apache Hive) v0.3-beta

Hello all,

We have released a new version of Hivemall, v0.3-beta2.

Hivemall is an open-source, scalable machine learning library that runs
on Hive/Hadoop.

  https://github.com/myui/hivemall
  http://bit.ly/hivemall-hadoopsummit14 (slides from Hadoop Summit '14)

Hivemall is easy to use if you have a Hive environment because every
machine learning step is done within HiveQL.

In the latest release (v0.3), we have added support for the following
state-of-the-art convex optimization algorithms (please refer to the
project site for the complete list of supported algorithms):

  o AdaGrad
  o AdaGradRDA
  o AdaDelta
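
For readers unfamiliar with these methods, here is a minimal sketch of
the textbook AdaGrad update on a sparse weight vector. It is written in
Python for illustration only and is not Hivemall's implementation; the
function name adagrad_step and the parameters eta and eps are
hypothetical.

```python
import math


def adagrad_step(weights, grad_sq_sums, gradient, eta=0.1, eps=1e-8):
    """Apply one textbook AdaGrad update in place.

    weights, grad_sq_sums and gradient are dicts keyed by feature name
    (a sparse representation). All names here are illustrative.
    """
    for feature, g in gradient.items():
        # Accumulate the squared gradient seen so far for this feature.
        grad_sq_sums[feature] = grad_sq_sums.get(feature, 0.0) + g * g
        # Per-feature learning rate shrinks as the accumulated sum grows.
        step = eta / (math.sqrt(grad_sq_sums[feature]) + eps)
        weights[feature] = weights.get(feature, 0.0) - step * g
    return weights
```

Each feature gets its own learning rate, so frequently updated features
take smaller steps over time while rare features keep larger ones.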

Moreover, Hivemall v0.3 now supports parameter mixing for better
stability and prediction performance and faster convergence of the
learning process.
https://github.com/myui/hivemall/wiki/How-to-use-Model-Mixing

With the MIX protocol, distributed learners (run as distinct Hadoop
tasks) communicate with each other by using an external communication
support service.
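
As a rough illustration of what mixing parameters across learners can
mean, the Python sketch below averages each feature weight over the
learners that hold it. This is an assumption made for illustration: it
does not reproduce the MIX protocol itself or its communication
service, and the name mix_parameters is hypothetical.

```python
def mix_parameters(local_models):
    """Average each feature weight over the learners that hold it.

    local_models is a list of dicts mapping feature name to weight,
    one dict per learner. Plain averaging is an illustrative choice,
    not necessarily the mixing rule Hivemall uses.
    """
    totals, counts = {}, {}
    for model in local_models:
        for feature, weight in model.items():
            totals[feature] = totals.get(feature, 0.0) + weight
            counts[feature] = counts.get(feature, 0) + 1
    return {feature: totals[feature] / counts[feature] for feature in totals}
```

In a real deployment the learners would exchange such values while
training is still running rather than after the fact; see the wiki page
above for how Hivemall actually does it.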

By using the MIX protocol (and Hivemall's amplifier method), iterations
are no longer mandatory, and machine learning runs well on plain
Hadoop/Hive. Hivemall runs on Tez as well.


Hope you enjoy the release! Feedback and pull requests are welcome.

Thanks,
Makoto