Posted to dev@systemml.apache.org by fs...@posteo.de on 2017/01/06 19:54:57 UTC

[Discuss] Google Summer of Code (GSoc) 2017

Hi all,

since it just came up on the mailing list, I want to bring this up again
for general discussion. I think we should try to get at least one or two
students for this year's GSoC. If you have never heard of GSoC, look here:
http://write.flossmanuals.net/gsoc-mentoring/what-is-gsoc/ and here:
https://developers.google.com/open-source/gsoc/

Applications for organizations open on January 19th, and it is a great
way to introduce new people to SystemML development and to gain more
contributors.
To apply, we need to propose projects for a 4-month period in which a
student works on them full time (May - August). Each proposed project
needs one community member to mentor it - in the end Google decides how
many students each project gets, depending on the quality of the
proposed ideas.
To successfully apply we need (1) good ideas for projects and (2) people 
willing to mentor those ideas.
As a first step, I suggest that we figure out whether we want to
participate (which mainly means we need to find people willing to
mentor projects) and then start collecting ideas. Ideas can range from
infrastructure to core development to the implementation of new
algorithms.

Here is a quick example of what a project proposal could look like:


Title: Performance Benchmarks and Experiments

Description: To make decisions about new features and to re-evaluate
old assumptions, we need up-to-date performance statistics on multiple
levels of the system and on different architectures (local,
distributed, GPU). Performance can be evaluated systematically with
performance tests and micro-benchmarks. In this way, changes to the
project or alternative implementations (e.g. for low-level linear
algebra backends) can be systematically evaluated and compared.
(Semi-)automated benchmarks can help make these decisions and challenge
assumptions that were made during earlier development. In the course of
this project, the student should build a benchmark infrastructure and
conduct experiments that compare different choices in critical parts
(sparsity thresholds, BLAS backends, optimization decisions, etc.).

Expected Outcome: A benchmark suite that can be used to detect
regressions or improvements in critical components of the system.

Skills required: Java/Scala, some knowledge of benchmarking; preferred:
knowledge of high-performance computing and/or distributed systems.

Possible Mentors: Matthias, Niketan, Nakul, Felix


Let's decide whether we want to apply as an organization!

- Felix

Re: [Discuss] Google Summer of Code (GSoc) 2017

Posted by Jeremy Anderson <je...@objectadjective.com>.

+1

I'd love to extend this to design as well. I'll dig into this and come back.

- Jeremy

...........................

Jeremy Anderson

Github: https://github.com/objectadjective
Twitter: https://twitter.com/ObjectAdjective
LinkedIn: http://www.linkedin.com/in/objectadjective

On 6 January 2017 at 12:12, Mike Dusenberry <du...@gmail.com> wrote:

> +1  We should definitely submit a few good project proposals, and
> particularly those that aim to improve the ability of the user to work on a
> wide range of ML problems in a simple and easy manner on top of Spark.
> This could include: building out a full ML demo to solve a real,
> large-scale problem that would benefit from a distributed approach; overall
> performance improvements that address a full class, or wider area, of ML
> algorithms, rather than a single, specific script; infrastructure for
> [performance] testing, and identification of wide areas of improvement
> (your example proposal fits here, and is quite nice!); helping with
> building out fully-featured, clean, well-tested DSLs in Python & Scala
> (we've started, but it would be good to continue stressing them -- we could
> even aim to replace DML with the DSLs); etc.  I like the example proposal
> that you've given since it would be beneficial to the entire project,
> rather than a single, isolated area.
>
> - Mike

Re: [Discuss] Google Summer of Code (GSoc) 2017

Posted by Mike Dusenberry <du...@gmail.com>.

+1  We should definitely submit a few good project proposals, and
particularly those that aim to improve the ability of the user to work on a
wide range of ML problems in a simple and easy manner on top of Spark.
This could include: building out a full ML demo to solve a real,
large-scale problem that would benefit from a distributed approach; overall
performance improvements that address a full class, or wider area, of ML
algorithms, rather than a single, specific script; infrastructure for
[performance] testing, and identification of wide areas of improvement
(your example proposal fits here, and is quite nice!); helping with
building out fully-featured, clean, well-tested DSLs in Python & Scala
(we've started, but it would be good to continue stressing them -- we could
even aim to replace DML with the DSLs); etc.  I like the example proposal
that you've given since it would be beneficial to the entire project,
rather than a single, isolated area.

- Mike


--

Michael W. Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

On Fri, Jan 6, 2017 at 11:57 AM, Madison Myers <ma...@gmail.com>
wrote:

> +1 I think it's a great idea, Felix

Re: [Discuss] Google Summer of Code (GSoc) 2017

Posted by Madison Myers <ma...@gmail.com>.

+1 I think it's a great idea, Felix



-- 
*Madison J. Myers*
*--------------------------*
*Spark Technology Center, IBM Watson*
*UC Berkeley, Master of Information & Data Science '17*

*King's College London, MA Political Science '14*
*New York University, BA Political Science '12*

LinkedIn: http://linkedin.com/in/madisonjmyers