Posted to dev@flink.apache.org by Alejandro Alcalde <al...@gmail.com> on 2018/10/16 08:05:40 UTC

Contribution to FlinkML

Hello all

I have spent some time developing a library for data preprocessing in Flink.

I am reaching out because this library is almost finished, and this month I
will be submitting a paper to a journal (pre-print available on arXiv:
https://arxiv.org/abs/1810.06021).

I've checked Flink's roadmap (
https://cwiki.apache.org/confluence/display/FLINK/FlinkML%3A+Vision+and+Roadmap)
and saw that you want to implement dimensionality reduction. My library has
six preprocessing algorithms: three discretizers and three feature selection
methods. I was wondering whether it would be possible to integrate them into
Flink. I am also willing to make any necessary changes to the algorithms if
you think they could be implemented more efficiently. This would also allow
me to improve my knowledge of and skill with Flink.

The code is at https://github.com/elbaulp/DPASF

Hoping to hear from you soon, best regards.

-- Alejandro Alcalde - elbauldelprogramador.com
<http://elbauldelprogramador.com>