Posted to dev@tuscany.apache.org by Eranda Sooriyabandara <07...@gmail.com> on 2011/07/10 20:15:25 UTC

[Progress Report] Develop a 'NoSQL' Datastore component for Apache Cassandra, CouchDB, Hadoop/Hbase

Hi all,
This mail describes my work up to the mid-term evaluation.
In this project my ultimate goal is to create portable SCA datastore
components over Apache Cassandra, Apache CouchDB and Apache Hadoop/HBase.

According to the architecture of the datastore component discussed in [1],
I first had to identify a common interface which can be used for all three
NoSQL databases. I came up with the following concepts, which I can use to
provide basic CRUD operations on those databases.

1. Session

The Session does the hard work of connecting to the database. It also
manages the sessions with the database.
In a Session the user can create a Database, get an existing Database or
delete a Database.

2. Database

A Database is a collection of data Groups. It can also be described as a
typical database found in SQL or NoSQL DBMSs.
In a Database the user can create a Group, get an existing Group or delete
a Group.

3. Group

A Group is a data set whose entries belong to the same category.
Currently a Group can hold entries of the String datatype only.

In a Group the user can add, get, modify and delete an entry.
To create the Group I used a different concept in each of the three
databases:


   - Apache Cassandra: I used a column family as a group
   - Apache CouchDB: I used a document as a group
   - Apache HBase: I used a column as a group
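
The three concepts above could be expressed roughly as the following Java
interfaces. This is only a sketch: all interface, class and method names are
assumptions made for illustration and are not taken from the project code,
and the in-memory classes merely stand in for a real backend (column family,
document or column). Error handling is left out here.

```java
import java.util.HashMap;
import java.util.Map;

// All names below are hypothetical; they only mirror the concepts in this mail.
interface Group {
    void addEntry(String key, String value);
    String getEntry(String key);
    void modifyEntry(String key, String value);
    void deleteEntry(String key);
}

interface Database {
    Group createGroup(String name);
    Group getGroup(String name);
    void deleteGroup(String name);
}

interface Session {
    Database createDatabase(String name);
    Database getDatabase(String name);
    void deleteDatabase(String name);
}

// Toy in-memory backend, standing in for a column family, document or column.
class InMemoryGroup implements Group {
    private final Map<String, String> entries = new HashMap<>();
    public void addEntry(String key, String value) { entries.put(key, value); }
    public String getEntry(String key) { return entries.get(key); }
    public void modifyEntry(String key, String value) { entries.put(key, value); }
    public void deleteEntry(String key) { entries.remove(key); }
}

class InMemoryDatabase implements Database {
    private final Map<String, Group> groups = new HashMap<>();
    public Group createGroup(String name) {
        Group group = new InMemoryGroup();
        groups.put(name, group);
        return group;
    }
    public Group getGroup(String name) { return groups.get(name); }
    public void deleteGroup(String name) { groups.remove(name); }
}

class InMemorySession implements Session {
    private final Map<String, Database> databases = new HashMap<>();
    public Database createDatabase(String name) {
        Database db = new InMemoryDatabase();
        databases.put(name, db);
        return db;
    }
    public Database getDatabase(String name) { return databases.get(name); }
    public void deleteDatabase(String name) { databases.remove(name); }
}

class DatastoreSketch {
    public static void main(String[] args) {
        Session session = new InMemorySession();
        Database db = session.createDatabase("demo");
        Group users = db.createGroup("users");
        users.addEntry("1", "Alice");
        users.modifyEntry("1", "Bob");
        System.out.println(users.getEntry("1")); // prints "Bob"
        users.deleteEntry("1");
        session.deleteDatabase("demo");
    }
}
```

With this shape, client code depends only on Session, Database and Group,
so a backend could be swapped by changing which Session implementation is
created.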


I added some exceptions, which are thrown in different situations and
describe the error. The exceptions are:


   - DatabaseNotFoundException - This exception is thrown when deleting a
   Database that does not exist
   - GroupNotFoundException - This exception is thrown when deleting a
   Group that does not exist
   - DuplicateEntryException - This exception is thrown when adding an
   entry with a key which already exists in the Group
   - EntryNotFoundException - This exception is thrown when reading or
   updating an entry which does not exist in the Group
   - SessionException - Any exception which occurs when handling the
   Session
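
As an illustration of the list above, the exceptions might be declared and
raised as follows. The exception names come from this mail; everything else
(making them checked exceptions, the constructor arguments, the messages and
the StrictGroup class) is an assumption for the sake of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Exception names are from the list above; making them checked is an assumption.
class DatabaseNotFoundException extends Exception {
    DatabaseNotFoundException(String name) { super("No such database: " + name); }
}

class GroupNotFoundException extends Exception {
    GroupNotFoundException(String name) { super("No such group: " + name); }
}

class DuplicateEntryException extends Exception {
    DuplicateEntryException(String key) { super("Duplicate key: " + key); }
}

class EntryNotFoundException extends Exception {
    EntryNotFoundException(String key) { super("No such entry: " + key); }
}

class SessionException extends Exception {
    SessionException(String message, Throwable cause) { super(message, cause); }
}

// Hypothetical in-memory Group showing where the entry-level exceptions fire.
class StrictGroup {
    private final Map<String, String> entries = new HashMap<>();

    void addEntry(String key, String value) throws DuplicateEntryException {
        if (entries.containsKey(key)) {
            throw new DuplicateEntryException(key);
        }
        entries.put(key, value);
    }

    String getEntry(String key) throws EntryNotFoundException {
        if (!entries.containsKey(key)) {
            throw new EntryNotFoundException(key);
        }
        return entries.get(key);
    }

    void modifyEntry(String key, String value) throws EntryNotFoundException {
        if (!entries.containsKey(key)) {
            throw new EntryNotFoundException(key);
        }
        entries.put(key, value);
    }
}

class ExceptionSketch {
    public static void main(String[] args) throws Exception {
        StrictGroup group = new StrictGroup();
        group.addEntry("1", "Alice");
        try {
            group.addEntry("1", "Bob"); // same key again
        } catch (DuplicateEntryException e) {
            System.out.println(e.getMessage()); // prints "Duplicate key: 1"
        }
    }
}
```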

Next iteration:

   - Create SCA components for the Apache Cassandra, Apache CouchDB and
   Apache Hadoop/HBase programs.
   - Test the components
   - Document the project
   - Write a tutorial on how these components work
   - Update the poms such that they can inherit from the Tuscany parent pom


I created a test case, run with JUnit, to check the basic CRUD operations.
Since all the databases use the same API, the test case is the same for all
of them.
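
The idea of one test for every backend can be sketched in plain Java as
below. The actual test case uses JUnit; this version avoids it only to stay
self-contained. The EntryStore interface, the MapStore class and the
CrudCheck helper are hypothetical names, and the two map-backed stores merely
stand in for the real Cassandra, CouchDB and HBase implementations.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical minimal API; the real interface in the project may differ.
interface EntryStore {
    void add(String key, String value);
    String get(String key);
    void modify(String key, String value);
    void delete(String key);
}

// Toy backend; different Map types play the role of different databases.
class MapStore implements EntryStore {
    private final Map<String, String> data;
    MapStore(Map<String, String> data) { this.data = data; }
    public void add(String key, String value) { data.put(key, value); }
    public String get(String key) { return data.get(key); }
    public void modify(String key, String value) { data.put(key, value); }
    public void delete(String key) { data.remove(key); }
}

class CrudCheck {
    // Runs one CRUD round trip; returns true if every step behaved as expected.
    static boolean run(EntryStore store) {
        store.add("k", "v1");
        boolean created = "v1".equals(store.get("k"));
        store.modify("k", "v2");
        boolean updated = "v2".equals(store.get("k"));
        store.delete("k");
        boolean deleted = store.get("k") == null;
        return created && updated && deleted;
    }

    public static void main(String[] args) {
        EntryStore[] backends = {
            new MapStore(new HashMap<>()),
            new MapStore(new TreeMap<>())
        };
        // The same check is applied to every backend, unchanged.
        for (EntryStore backend : backends) {
            System.out.println(run(backend)); // prints "true" for each backend
        }
    }
}
```

In a JUnit setting the same effect is commonly achieved with an abstract
test class whose subclasses only supply the backend under test.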

Many other communities helped me with various issues during my work, and I
am thankful to them as well.

   1. Apache Cassandra community
   2. Apache CouchDB community
   3. Apache HBase community
   4. Hector community (high level client for Apache Cassandra)
   5. jcouchdb community (high level client for Apache CouchDB)


Thanks to the Tuscany community for all the help, especially
Jean-Sebastian.
You can find my code in [2], and you are welcome to suggest ideas to make
this project a success.

thanks
Eranda Sooriyabandara

[1] https://cwiki.apache.org/confluence/display/TUSCANYWIKI/Develop+a+NoSQL+Datastore+component
[2] https://svn.apache.org/repos/asf/tuscany/collaboration/GSoC-2011-Eranda/