Posted to dev@community.apache.org by Ross Gardler <rg...@apache.org> on 2009/12/18 11:18:27 UTC
Academic outreach activities for Hadoop
I had a meeting with Simon Metson of the University of Bristol and Steve
Loughran of HP Labs (Bristol) yesterday (both cc'd). One of the topics
of discussion was reaching out to the academic sector from the Hadoop
project.
In short it is felt that the academic sector has big data on a scale
equal to or greater than big players such as Yahoo!, Facebook and
Cloudera (e.g. Simon works on data from various sources such as
landslide modelling for cost benefit analysis and data collected from
experiments such as those conducted at the Large Hadron Collider).
It was therefore agreed that there is a real need for the academic
sector to get to grips with Hadoop. Having large data sets and practical
applications such as these would undoubtedly help the Hadoop project in
terms of testing and validation. It's hoped that there would eventually
be code contributions from the sector too.
I suggested that the Community Development project would be the right
vehicle for this via the mentoring programme [1]. We are also thinking
of organising an event or two in the UK next year.
Since I'm not involved with the Hadoop project Steve has offered to work
with the Hadoop community to find suitable mentors. I'm posting here for
transparency and also in the hope that others in the community may be
interested in helping move this effort forwards.
I've not copied this mail to the Hadoop list; I'll let Steve and others
do that.
Steve - It may be worth subscribing to dev@community.apache.org which is
where we will be running mentoring programmes and may be able to support
some of your other activities.
Ross
[1] http://community.apache.org/mentoringprogramme.html
Re: Academic outreach activities for Hadoop
Posted by Ross Gardler <rg...@apache.org>.
On 18/12/2009 17:10, Isabel Drost wrote:
> On Fri Ross Gardler <rg...@apache.org> wrote:
>> It was therefore agreed that there is a real need for the academic
>> sector to get to grips with Hadoop. Having large data sets and
>> practical applications such as these would undoubtedly help the
>> Hadoop project in terms of testing and validation. It's hoped that
>> there would eventually be code contributions from the sector too.
>
> At least here in Berlin (TU Berlin as well as HPI Potsdam) there is
> interest in contributing back to the community (in this case the
> Hadoop and the Mahout community). Currently it is mostly student
> projects done during labs that people (lecturers as well as some
> students) are interested in contributing. I told them about the ASF
> mentoring program already.
Excellent.
> I have been talking to several local people; there are two to
> three problems usually encountered in the academic sector:
>
> 1) Doing open source work does not give you any credits for your
> scientific career, so there is little incentive to contribute back or
> to release your work under an open source license. I personally have no
> great idea how this problem could be fixed except through finding
> interested individuals, discussing the advantages of free software in
> general and personal participation in open source projects in
> particular.
I face this problem every day in my day job. There are many incentives
for contributing back; we just have to educate them. Some examples:
- better quality research
- reproducible research
- sustainable research outputs
- exposure to additional funding streams
- wider network of research collaborators
The problem is that they don't understand open source software
development. In the commercial sector the equivalent argument is:
"There is no direct credit in my annual review, so there is little
incentive to contribute back or to release my work under an open source
licence."
> 2) People are not really familiar with how to contribute to projects.
> So there is a need for mentoring, explaining and getting the word out.
Again, I deal with that daily in my day job and now we have the
Community Development project to help solve this problem. Of course this
is true of the commercial world as well as the academic world.
> 3) Some people are not familiar with the transparent, public model of
> communication in most open source projects, especially here at the ASF.
> Again, fixing this problem probably needs quite a bit of explanation
> and "getting used to".
Most people, both academic and non-academic, are unfamiliar with this model.
In all cases there are lots of resources available at
http://www.oss-watch.ac.uk
These are written for the academic sector but in most cases are
applicable to the non-academic sector.
> Personally, my experience is that it is comparatively easy to
> get students convinced. It does get a little harder with PhD students
> but is still possible. General lack of time when working on a PhD adds
> to the problems.
Agreed. The key is to find people who actually understand the benefits
and want to participate. With respect to Hadoop in the UK we have at
least one research leader who wants to go this way (Simon Metson, cc'd).
Ross
Re: Academic outreach activities for Hadoop
Posted by Isabel Drost <is...@apache.org>.
On Fri Ross Gardler <rg...@apache.org> wrote:
> It was therefore agreed that there is a real need for the academic
> sector to get to grips with Hadoop. Having large data sets and
> practical applications such as these would undoubtedly help the
> Hadoop project in terms of testing and validation. It's hoped that
> there would eventually be code contributions from the sector too.
At least here in Berlin (TU Berlin as well as HPI Potsdam) there is
interest in contributing back to the community (in this case the
Hadoop and the Mahout community). Currently it is mostly student
projects done during labs that people (lecturers as well as some
students) are interested in contributing. I told them about the ASF
mentoring program already.
I have been talking to several local people; there are two to
three problems usually encountered in the academic sector:
1) Doing open source work does not give you any credits for your
scientific career, so there is little incentive to contribute back or
to release your work under an open source license. I personally have no
great idea how this problem could be fixed except through finding
interested individuals, discussing the advantages of free software in
general and personal participation in open source projects in
particular.
2) People are not really familiar with how to contribute to projects.
So there is a need for mentoring, explaining and getting the word out.
3) Some people are not familiar with the transparent, public model of
communication in most open source projects, especially here at the ASF.
Again, fixing this problem probably needs quite a bit of explanation
and "getting used to".
Personally, my experience is that it is comparatively easy to
get students convinced. It does get a little harder with PhD students
but is still possible. General lack of time when working on a PhD adds
to the problems.
Isabel