Posted to dev@tuscany.apache.org by Eranda Sooriyabandara <07...@gmail.com> on 2011/03/16 08:28:00 UTC

GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Hi all,
I have started writing my GSoC 2011 project proposal, "Develop a 'NoSQL'
Datastore component for Apache Cassandra, CouchDB, Hadoop/HBase", on the
Tuscany wiki [1]. Please feel free to add comments and make changes if needed.
Your feedback is highly appreciated.

thanks
Eranda

[1].
https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Hi Giorgio,

On Sat, Mar 26, 2011 at 12:43 AM, Giorgio Zoppi <gi...@gmail.com> wrote:

> Dear Eranda,
> it is not up to me to decide.


Yes, we will see what the others say.

> I wish you all the best and hope you
> enjoy working with Tuscany.


Thanks Giorgio, I am really enjoying working with Tuscany.

Thanks
Eranda

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Giorgio Zoppi <gi...@gmail.com>.
Dear Eranda,
it is not up to me to decide. I wish you all the best and hope you
enjoy working with Tuscany.
Giorgio.

2011/3/25 Eranda Sooriyabandara <07...@gmail.com>:
> Hi Giorgio,
> It's great to hear that this project should continue, and I am hoping
> to continue with it.
> thanks
> Eranda
> P.S. It's 12.30 am here



-- 
I want to be the ray of sunshine that wakes you every day
to make you breathe and live in me.
"Favola -Moda".

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Hi Giorgio,
It's great to hear that this project should continue, and I am hoping
to continue with it.

thanks
Eranda

P.S. It's 12.30 am here

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Giorgio Zoppi <gi...@gmail.com>.
2011/3/25 Eranda Sooriyabandara <07...@gmail.com>:
> Hi Raymond,
> I found out that Spring Framework components can be plugged into
> Apache Tuscany using <implementation.spring>. Does that mean that this
> project has no value because Spring Data offers the same
> functionality, or should it continue since it is built as a Tuscany SCA
> component?

No, absolutely not. With this project we avoid the Spring layer. Think
of Tuscany as an integrator of different technologies and
component flavours; I mean that with your project we would be able to
abstract each datastore. Don't lose sight of the big picture :).
Uhm, I am still at work at 8 pm.
BR,
Giorgio.


-- 
I want to be the ray of sunshine that wakes you every day
to make you breathe and live in me.
"Favola -Moda".

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Hi Raymond,
I found out that Spring Framework components can be plugged into
Apache Tuscany using <implementation.spring>. Does that mean that this
project has no value because Spring Data offers the same
functionality, or should it continue since it is built as a Tuscany SCA
component?

thanks
Eranda

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Raymond Feng <en...@gmail.com>.
FYI: There is a similar project at Spring:

http://www.springsource.org/spring-data

Thanks,
Raymond
________________________________________________________________ 
Raymond Feng
rfeng@apache.org
Apache Tuscany PMC member and committer: tuscany.apache.org
Co-author of Tuscany SCA In Action book: www.tuscanyinaction.com
Personal Web Site: www.enjoyjava.com
________________________________________________________________

On Mar 25, 2011, at 10:58 AM, Eranda Sooriyabandara wrote:

> Hi all,
> I completed the implementation plan of my project proposal, "Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/HBase", which you can find at [1]. Please let me know your thoughts on it.
> 
> thanks
> Eranda
> 
> [1]. https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component


Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Thanks Florian. Actually, I am very impressed by this project since it
involves new technologies that I haven't learned before.

Eranda

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Florian Moga <mo...@gmail.com>.
Hi Eranda,

This is definitely a good start. It seems to me as well that the project is
very substantial for the limited timeframe of GSoC, so it might be better to
focus on the core integration first and then add things incrementally (even
after GSoC if you are willing to contribute).

I'm glad you are excited about the project.

Good luck,

Florian


On Thu, Mar 31, 2011 at 10:14 AM, Eranda Sooriyabandara
<07...@gmail.com> wrote:

> Hi Jean-Sebastien,
>
>
>> - In several places you mention 'a SCA portable data store component'
>> or 'this component' for example. I suggest to do a pass over the text
>> of the proposal and make really clear that there are multiple
>> components, one (or two... see my next question) per data store type,
>> to avoid any confusion.
>>
>> - Are you planning to have just one component per data store type? or
>> two components per data store type? maybe one component wrapping and
>> providing as a service the technical database API for the specific
>> datastore, and a second component providing that uniform REST data
>> access on top? I was not sure of your intention after reading the
>> description of your component reference... If I had to pick a design
>> I'd probably choose two separate components, but I leave that decision
>> to you, and perhaps this is something that you don't even need to
>> decide now... but only after you actually investigate the various
>> APIs. What do you think?
>>
>
> I have the same idea as you: two separate components, one for wrapping
> the database API and the other for providing uniform REST data access on
> top. This will help us build a composite that contains all the
> components. Also, if we need to change the model later, we can change it
> accordingly. I'll clear up the confusion in the project proposal.
>
>
>
>> - I'm not sure if having interaction policies to handle database
>> authentication is going to be too much additional work for this
>> project. You already have to deal with the integration of three
>> databases, which is going to be a lot of work by itself. What do
>> others in the team think?
>>
>
> Yes, I agree there is a lot of work to be done if we provide this
> functionality, since we would have to use (or create) components that
> provide the identity services. Since we don't have much time, I think we
> should postpone it until after Summer of Code.
>
>> - I'd suggest to include a few things in the Apr 25 - May 23 phase:
>> a) define a common tutorial / sample scenario that you're going to
>> implement over the various databases in the next phases
>> b) start to hack small parts of the scenario over the databases,
>> without Tuscany in the picture, as an exercise to learn their APIs
>> c) start to put together the database independent parts of the
>> scenario in Tuscany, and mock up the database access for this
>>
>
> That's a good idea. Thanks for the suggestions; I will modify the
> proposal according to them.
>
>
>> I'm hoping that doing that up front will help provide some context
>> while you're experimenting with the database APIs and prepare you
>> better to shape up the common service interface you're planning to
>> design in phase 2. What do you think?
>>
>
> I am somewhat familiar with the Apache Cassandra API, and I am
> currently looking at the other APIs. I will come up with a preliminary
> interface soon.
>
> Thanks
> Eranda
>
>

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Hi Jean-Sebastien,


> - In several places you mention 'a SCA portable data store component'
> or 'this component' for example. I suggest to do a pass over the text
> of the proposal and make really clear that there are multiple
> components, one (or two... see my next question) per data store type,
> to avoid any confusion.
>
> - Are you planning to have just one component per data store type? or
> two components per data store type? maybe one component wrapping and
> providing as a service the technical database API for the specific
> datastore, and a second component providing that uniform REST data
> access on top? I was not sure of your intention after reading the
> description of your component reference... If I had to pick a design
> I'd probably choose two separate components, but I leave that decision
> to you, and perhaps this is something that you don't even need to
> decide now... but only after you actually investigate the various
> APIs. What do you think?
>

I have the same idea as you: two separate components, one for wrapping
the database API and the other for providing uniform REST data access on
top. This will help us build a composite that contains all the
components. Also, if we need to change the model later, we can change it
accordingly. I'll clear up the confusion in the project proposal.
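
As a rough illustration of the two-component split (not part of the
proposal itself), the Java sketch below shows a wrapper-level interface
and a second class that offers REST-style access on top of it. All names
here (DataStoreService, InMemoryDataStore, RestDataAccess) are
hypothetical; they do not come from Tuscany or from any database client
library, and the in-memory map only stands in for a real wrapper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical wrapper-level contract: one implementation per datastore
// (Cassandra, CouchDB, HBase). Name and methods are assumptions.
interface DataStoreService {
    Optional<String> get(String key);
    void put(String key, String value);
    boolean delete(String key);
}

// Stand-in for a real wrapper component. It keeps data in memory so the
// sketch runs without any database client library.
class InMemoryDataStore implements DataStoreService {
    private final Map<String, String> rows = new HashMap<>();

    public Optional<String> get(String key) {
        return Optional.ofNullable(rows.get(key));
    }

    public void put(String key, String value) {
        rows.put(key, value);
    }

    public boolean delete(String key) {
        return rows.remove(key) != null;
    }
}

// Second component: maps REST-style verbs onto whichever wrapper it is
// wired to, and answers with HTTP-like status codes as plain ints.
class RestDataAccess {
    private final DataStoreService store;

    RestDataAccess(DataStoreService store) {
        this.store = store;
    }

    int handle(String verb, String key, String body) {
        switch (verb) {
            case "GET":
                return store.get(key).isPresent() ? 200 : 404;
            case "PUT":
                store.put(key, body);
                return 204;
            case "DELETE":
                return store.delete(key) ? 204 : 404;
            default:
                return 405; // verb not supported by this sketch
        }
    }
}

class TwoComponentSketch {
    public static void main(String[] args) {
        RestDataAccess rest = new RestDataAccess(new InMemoryDataStore());
        System.out.println(rest.handle("PUT", "user/1", "Eranda"));
        System.out.println(rest.handle("GET", "user/1", null));
        System.out.println(rest.handle("DELETE", "user/1", null));
        System.out.println(rest.handle("GET", "user/1", null));
    }
}
```

Because RestDataAccess only depends on the interface, swapping the
in-memory stand-in for a real wrapper would not change the REST side.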



> - I'm not sure if having interaction policies to handle database
> authentication is going to be too much additional work for this
> project. You already have to deal with the integration of three
> databases, which is going to be a lot of work by itself. What do
> others in the team think?
>

Yes, I agree there is a lot of work to be done if we provide this
functionality, since we would have to use (or create) components that
provide the identity services. Since we don't have much time, I think we
should postpone it until after Summer of Code.

> - I'd suggest to include a few things in the Apr 25 - May 23 phase:
> a) define a common tutorial / sample scenario that you're going to
> implement over the various databases in the next phases
> b) start to hack small parts of the scenario over the databases,
> without Tuscany in the picture, as an exercise to learn their APIs
> c) start to put together the database independent parts of the
> scenario in Tuscany, and mock up the database access for this
>

That's a good idea. Thanks for the suggestions; I will modify the
proposal according to them.


> I'm hoping that doing that up front will help provide some context
> while you're experimenting with the database APIs and prepare you
> better to shape up the common service interface you're planning to
> design in phase 2. What do you think?
>

I am somewhat familiar with the Apache Cassandra API, and I am currently
looking at the other APIs. I will come up with a preliminary interface
soon.

Thanks
Eranda

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Jean-Sebastien Delfino <js...@gmail.com>.
Hi,

This is starting to look good, I think.

Here are a few questions and comments:

- In several places you mention 'a SCA portable data store component'
or 'this component' for example. I suggest to do a pass over the text
of the proposal and make really clear that there are multiple
components, one (or two... see my next question) per data store type,
to avoid any confusion.

- Are you planning to have just one component per data store type? or
two components per data store type? maybe one component wrapping and
providing as a service the technical database API for the specific
datastore, and a second component providing that uniform REST data
access on top? I was not sure of your intention after reading the
description of your component reference... If I had to pick a design
I'd probably choose two separate components, but I leave that decision
to you, and perhaps this is something that you don't even need to
decide now... but only after you actually investigate the various
APIs. What do you think?

- I'm not sure if having interaction policies to handle database
authentication is going to be too much additional work for this
project. You already have to deal with the integration of three
databases, which is going to be a lot of work by itself. What do
others in the team think?

- I'd suggest to include a few things in the Apr 25 - May 23 phase:
a) define a common tutorial / sample scenario that you're going to
implement over the various databases in the next phases
b) start to hack small parts of the scenario over the databases,
without Tuscany in the picture, as an exercise to learn their APIs
c) start to put together the database independent parts of the
scenario in Tuscany, and mock up the database access for this

I'm hoping that doing that up front will help provide some context
while you're experimenting with the database APIs and prepare you
better to shape up the common service interface you're planning to
design in phase 2. What do you think?
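
To illustrate step c) above, here is a minimal Java sketch of a
database-independent scenario written against a mocked storage contract.
It is only an assumption about how such a mock-up could look: the names
ItemStore, MockItemStore and CatalogScenario are made up for this
example, and the "catalog" scenario is a placeholder, not the scenario
chosen for the project.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical storage contract that the scenario code depends on.
interface ItemStore {
    void save(String id, String description);
    String load(String id); // returns null when the id is unknown
    List<String> ids();
}

// Mock used until real Cassandra, CouchDB and HBase wrappers exist.
class MockItemStore implements ItemStore {
    private final Map<String, String> items = new LinkedHashMap<>();

    public void save(String id, String description) {
        items.put(id, description);
    }

    public String load(String id) {
        return items.get(id);
    }

    public List<String> ids() {
        return new ArrayList<>(items.keySet());
    }
}

// Database-independent part of a placeholder scenario (a tiny catalog).
class CatalogScenario {
    private final ItemStore store;

    CatalogScenario(ItemStore store) {
        this.store = store;
    }

    void addItem(String id, String description) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("id must not be empty");
        }
        store.save(id, description);
    }

    String describe(String id) {
        String description = store.load(id);
        return description == null ? "unknown item " + id : id + ": " + description;
    }

    int size() {
        return store.ids().size();
    }
}

class MockScenarioDemo {
    public static void main(String[] args) {
        CatalogScenario catalog = new CatalogScenario(new MockItemStore());
        catalog.addItem("apple", "red fruit");
        System.out.println(catalog.describe("apple"));
        System.out.println(catalog.describe("pear"));
    }
}
```

The scenario class never sees which database is behind ItemStore, so the
same code could later run over each real datastore wrapper.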

I'd also like some input from the other folks in the Tuscany community.
So, if you guys could help review the proposal too, that'd be great... Thanks!

--
Jean-Sebastien

On Fri, Mar 25, 2011 at 10:58 AM, Eranda Sooriyabandara
<07...@gmail.com> wrote:
> Hi all,
> I completed the implementation plan of my project proposal, "Develop a
> 'NoSQL' Datastore component for Apache Cassandra, CouchDB,
> Hadoop/HBase", which you can find at [1]. Please let me know your thoughts
> on it.
> thanks
> Eranda
> [1]. https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component



-- 
Jean-Sebastien

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Hi all,
I completed the implementation plan of my project proposal, "Develop
a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/HBase",
which you can find at [1]. Please let me know your thoughts on it.

thanks
Eranda

[1].
https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Hi Jean-Sebastien,
I have updated the project proposal, adding some of your ideas, the links and
material from Florian's previous project proposal, and I am hoping to come up
with a more advanced proposal. Please let me know your thoughts on the current
proposal (especially on the timeline). That would be a great help in improving
it.
Also, before completing the "overview of the implementation plan", there are
some questions I need to discuss.

   1. My plan is to create a separate component for each database and build a
   composite out of them. What do you think?
   2. The final common API for all the database components must be decided. Any
   ideas for it?
   3. We need to decide the order in which we implement the components. I think
   we should start with Apache Cassandra since I have some knowledge of it.
   4. Apache Cassandra has many client APIs. Thrift is the basic one,
   and there are higher-level Java APIs such as Hector, Pelops,
   Kundera and DataNucleus-JDO. One concern with using these APIs is
   how bulky the components would become, since they bring in many dependencies.
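
To make points 1 and 2 a little more concrete, here is a small Java
sketch of a composite that holds one component per database behind a
shared interface. It is only an assumption about what such an API might
look like: KeyValueComponent, MapBackedComponent and CompositeDataStore
are invented names, and the map-backed class stands in for real
Cassandra, CouchDB or HBase components.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical common API shared by every database component.
interface KeyValueComponent {
    void put(String key, String value);
    String get(String key); // returns null when the key is unknown
}

// Placeholder for a per-database component; a real one would call the
// database's own client API instead of a map.
class MapBackedComponent implements KeyValueComponent {
    private final Map<String, String> data = new HashMap<>();

    public void put(String key, String value) {
        data.put(key, value);
    }

    public String get(String key) {
        return data.get(key);
    }
}

// Composite that groups the per-database components under one roof and
// hands out the right one by name.
class CompositeDataStore {
    private final Map<String, KeyValueComponent> components = new LinkedHashMap<>();

    void register(String name, KeyValueComponent component) {
        components.put(name, component);
    }

    KeyValueComponent component(String name) {
        KeyValueComponent found = components.get(name);
        if (found == null) {
            throw new IllegalArgumentException("no component named " + name);
        }
        return found;
    }

    Set<String> names() {
        return components.keySet();
    }
}

class CompositeSketch {
    public static void main(String[] args) {
        CompositeDataStore composite = new CompositeDataStore();
        composite.register("cassandra", new MapBackedComponent());
        composite.register("couchdb", new MapBackedComponent());
        composite.component("cassandra").put("greeting", "hello");
        System.out.println(composite.names());
        System.out.println(composite.component("cassandra").get("greeting"));
    }
}
```

Registering components by name also leaves the implementation order
open: Cassandra could be registered first and the others added later.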

I hope you have some ideas on these points.

thanks
Eranda

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Jean-Sebastien Delfino <js...@gmail.com>.
On 03/21/2011 01:44 AM, Jean-Sebastien Delfino wrote:
> On 03/16/2011 12:28 AM, Eranda Sooriyabandara wrote:
>> Hi all,
>> I have started writing my GSoC 2011 project proposal, "Develop a 'NoSQL'
>> Datastore component for Apache Cassandra, CouchDB, Hadoop/HBase", on the
>> Tuscany wiki [1]. Please feel free to add comments and make changes if
>> needed. Your feedback is highly appreciated.
>>
>> thanks
>> Eranda
>>
>> [1].
>> https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component
>>
>>
>
> Hi Eranda,
>
> This is a good start.
>
> I've looked at the GSoC program pages and here are some links that
> should help you with your proposal:
>
> The GSoC FAQ, with some example applications:
> http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/faqs#student_app
>
>
> The GSoC student guide, in particular the section on writing a proposal:
> http://www.booki.cc/gsocstudentguide/_v/1.0/writing-a-proposal/
>
> The resources page, including the GSoC mailing lists that you should
> subscribe to:
> http://www.booki.cc/gsocstudentguide/_v/1.0/additional-resources/
>
> Also searching for 'gsoc proposal examples' and 'gsoc proposal
> guidelines' on Google returns plenty of good examples and templates,
> from the PHP, Chromium, Gnome, BSD organizations etc.
>
> I think it'd be good to include in your proposal an overview of
> an implementation plan, a description of the deliverables, a timeline, a
> short bio, and an explanation of why you're interested in the project and
> why you're the best candidate for it.
>
> Hope this helps

One more thing:

You've started to work with the Tuscany community and this is really great!

Since this project is also about integrating Tuscany with other Apache 
projects, Cassandra, Hadoop, CouchDB, etc, it might be good to get your 
proposal reviewed by them as well.

Looks like you're already talking on the Cassandra mailing list, perhaps 
point them to your proposal and discuss it there too...

-- 
Jean-Sebastien

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Jean-Sebastien Delfino <js...@apache.org>.
On 03/16/2011 12:28 AM, Eranda Sooriyabandara wrote:
> Hi all,
> I have started writing my GSoC 2011 project proposal, "Develop a 'NoSQL'
> Datastore component for Apache Cassandra, CouchDB, Hadoop/HBase", on the
> Tuscany wiki [1]. Please feel free to add comments and make changes if needed.
> Your feedback is highly appreciated.
>
> thanks
> Eranda
>
> [1].
> https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component
>

Hi Eranda,

This is a good start.

I've looked at the GSoC program pages and here are some links that 
should help you with your proposal:

The GSoC FAQ, with some example applications:
http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/faqs#student_app

The GSoC student guide, in particular the section on writing a proposal:
http://www.booki.cc/gsocstudentguide/_v/1.0/writing-a-proposal/

The resources page, including the GSoC mailing lists that you should 
subscribe to:
http://www.booki.cc/gsocstudentguide/_v/1.0/additional-resources/

Also searching for 'gsoc proposal examples' and 'gsoc proposal 
guidelines' on Google returns plenty of good examples and templates, 
from the PHP, Chromium, Gnome, BSD organizations etc.

I think it'd be good to include in your proposal an overview of
an implementation plan, a description of the deliverables, a timeline, a
short bio, and an explanation of why you're interested in the project and
why you're the best candidate for it.

Hope this helps
-- 
Jean-Sebastien

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Florian Moga <mo...@gmail.com>.
Cool. It might also be a good idea for you to post a comment on JIRA that
you're interested in the project and you're working on the proposal. The
list of accepted organizations has just been posted and Apache's project
ideas list will be publicly available soon.


On Fri, Mar 18, 2011 at 9:13 PM, Eranda Sooriyabandara <07...@gmail.com> wrote:

> Hi Florian,
>
>> We now have a dedicated page for GSoC 2011 on the wiki [1]. Would you like
>> to move your application page as a child page of [2]? That's where all
>> Tuscany student applications will reside.
>>
>
> This is a great idea; I have moved my GSoC application page to be a child
> of the GSoC 2011 Applications page.
> thanks
> Eranda
>

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Eranda Sooriyabandara <07...@gmail.com>.
Hi Florian,

> We now have a dedicated page for GSoC 2011 on the wiki [1]. Would you like
> to move your application page as a child page of [2]? That's where all
> Tuscany student applications will reside.
>

This is a great idea; I have moved my GSoC application page to be a child
of the GSoC 2011 Applications page.
thanks
Eranda

Re: GSoC Project Proposal : Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Posted by Florian Moga <mo...@gmail.com>.
Hi Eranda,

We now have a dedicated page for GSoC 2011 on the wiki [1]. Would you like
to move your application page as a child page of [2]? That's where all
Tuscany student applications will reside.

Thanks,

Florian

[1] https://cwiki.apache.org/confluence/display/TUSCANYWIKI/GSoC+2011
[2]
https://cwiki.apache.org/confluence/display/TUSCANYWIKI/GSoC+2011+Applications


On Wed, Mar 16, 2011 at 9:28 AM, Eranda Sooriyabandara <07...@gmail.com> wrote:

> Hi all,
> I have started writing my GSoC 2011 project proposal, "Develop a 'NoSQL'
> Datastore component for Apache Cassandra, CouchDB, Hadoop/HBase", on the
> Tuscany wiki [1]. Please feel free to add comments and make changes if needed.
> Your feedback is highly appreciated.
>
> thanks
> Eranda
>
> [1].
> https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component
>