Posted to dev@spark.apache.org by Kai Jiang <ji...@gmail.com> on 2016/02/05 07:53:57 UTC

[GSoC] Interested in GSoC 2016 ideas

Hello All,

I am Kai Jiang, a master's student majoring in Computer Science. My
interests are Machine Learning and Distributed Systems, which is why I
have been contributing to the Spark codebase since last year. My pull
requests are related to MLlib, PySpark and SQL
(https://github.com/apache/spark/pulls/vectorijk).

This year, I would like to extend my contributions to Spark into a
GSoC project. Although the list of GSoC organizations for this year has
not been announced yet, it is highly likely that the Apache Software
Foundation will be accepted, based on the organization lists of previous
years. I was therefore wondering whether there are specific ideas,
issues or suggestions regarding MLlib, SQL or other components that
could be gathered into a project. I also noticed that Spark 2.0 will be
a major release in the near future. After looking into the MLlib 2.0
Roadmap <https://issues.apache.org/jira/browse/SPARK-12626>, I found
many issues I am interested in (e.g. Python/SparkR API for ML, PMML
export). If the community has other ideas, I am very willing to work on
some issues before GSoC and get started with something new during GSoC.

Looking forward to hearing from you!


Best,
Kai.
github.com/vectorijk