Posted to dev@gora.apache.org by Marcel Offermans <ma...@luminis.nl> on 2011/12/18 18:54:53 UTC

When to use Gora over a traditional ORM for SQL databases?

When discussing the Gora graduation resolution, Chris encouraged me to take this question to the gora-dev list, so I'm quoting this part "out of context" here, and would be interested in learning a bit more about this.

As a bit of background on me: I'm working on an open source project called Amdatu, which has followed the "Apache way" from the start. We're still building a community and codebase, and hopefully sometime in 2012 we will be mature enough to migrate at least some parts of our codebase to the Incubator. Part of what we're doing can be classified as "big data in the cloud", so anything that works with, for example, Cassandra is of interest to me.

Back to the question:

>> The background of the project does state that it has "limited support for SQL databases" and that it "ignores complex SQL mappings" so just out of interest, when would you use Gora over for example JDO (or JPA or Hibernate) when using a SQL database?
> 
> I think this might be a good thread over on gora-dev if you are interested. We'd be happy
> to answer it there.

Some insights would be very welcome! :)

Greetings, Marcel

Re: When to use Gora over a traditional ORM for SQL databases?

Posted by Enis Söztutar <en...@gmail.com>.

Hi,

Very good question indeed. The short answer is that you will not want to
use Gora over JDO, JPA, Hibernate, etc. if you are using a relational SQL
database. Gora does not try to reinvent the wheel in that respect.

Now, coming to the long answer, let me quote the home page:

Although there are various excellent ORM frameworks for relational
databases, data modeling in NoSQL data stores differs profoundly from that
of their relational cousins. Moreover, data-model-agnostic frameworks such
as JDO are not sufficient for use cases where one needs to use the full
power of the data models in column stores. Gora fills this gap by giving
the user an easy-to-use ORM framework with data-store-specific mappings and
built-in Apache Hadoop support.

The overall goal for Gora is to become the standard data representation and
persistence framework for big data. The roadmap of Gora can be grouped as
follows.

   - *Data Persistence:* Persisting objects to column stores such as
   HBase, Cassandra, Hypertable; key-value stores such as Voldemort, Redis,
   etc.; SQL databases such as MySQL, HSQLDB; and flat files in the local
   file system or Hadoop HDFS.
   - *Data Access:* An easy-to-use, Java-friendly common API for accessing
   the data regardless of its location.
   - *Indexing:* Persisting objects to Lucene and Solr indexes, and
   accessing/querying the data with the Gora API.
   - *Analysis:* Accessing and analyzing the data through adapters
   for Apache Pig, Apache Hive and Cascading.
   - *MapReduce support:* Out-of-the-box and extensive MapReduce (Apache
   Hadoop) support for data in the data store.

So, the main aim of Gora is to develop a common in-memory data
representation for column stores, key-value stores, and document stores.
However, since every NoSQL store has different storage arrangements and
different data models, Gora tries not to restrict how the data is laid out
in each data store. For example, an array in a data bean can be serialized
as either
 - columnfamily:0 -> arr[0]  or
 - columnfamily:arr[0] -> null
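
The two array layouts above can be sketched in plain Java. This is only an illustration: the class and method names (ArrayLayouts, indexAsQualifier, elementAsQualifier) are hypothetical and are not part of the Gora API, and a map from "family:qualifier" to value stands in for a row in a column store.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Gora.
final class ArrayLayouts {

    // Layout 1: the element index is the column qualifier, the element is the value.
    static Map<String, String> indexAsQualifier(String family, List<String> arr) {
        Map<String, String> cells = new LinkedHashMap<>();
        for (int i = 0; i < arr.size(); i++) {
            cells.put(family + ":" + i, arr.get(i));
        }
        return cells;
    }

    // Layout 2: the element itself is the column qualifier, the value is left empty.
    static Map<String, String> elementAsQualifier(String family, List<String> arr) {
        Map<String, String> cells = new LinkedHashMap<>();
        for (String element : arr) {
            cells.put(family + ":" + element, null);
        }
        return cells;
    }

    public static void main(String[] args) {
        List<String> arr = Arrays.asList("a.example", "b.example");
        System.out.println(indexAsQualifier("columnfamily", arr));
        System.out.println(elementAsQualifier("columnfamily", arr));
    }
}
```

In this sketch the first layout keeps element order and duplicates, while the second collapses duplicate elements into one column, which is one reason the choice is better left to the store-specific mapping.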

Further, with different Gora modules, we will be able to use the same data
bean to read a row from Cassandra, store it in memcached, and index it in
Solr.
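
As a rough sketch of that idea, assuming a minimal store-agnostic interface: the Store, InMemoryStore and WebPage types below are hypothetical stand-ins written for this example, not Gora's actual interfaces, and the same in-memory class plays the role of the row store, the cache and the index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these types are not Gora's actual API.
final class SameBeanManyStores {

    // A plain data bean shared by every backend.
    static final class WebPage {
        final String url;
        final String title;

        WebPage(String url, String title) {
            this.url = url;
            this.title = title;
        }
    }

    // Minimal store-agnostic contract assumed for this example.
    interface Store<K, V> {
        V get(K key);
        void put(K key, V value);
    }

    // In-memory stand-in for a real backend.
    static final class InMemoryStore<K, V> implements Store<K, V> {
        private final Map<K, V> data = new HashMap<>();

        public V get(K key) { return data.get(key); }
        public void put(K key, V value) { data.put(key, value); }
    }

    // Reads one bean from the source and writes the same object to each target.
    @SafeVarargs
    static <K, V> void copy(K key, Store<K, V> source, Store<K, V>... targets) {
        V bean = source.get(key);
        if (bean == null) {
            return; // nothing stored under this key
        }
        for (Store<K, V> target : targets) {
            target.put(key, bean);
        }
    }

    public static void main(String[] args) {
        Store<String, WebPage> rowStore = new InMemoryStore<>();
        Store<String, WebPage> cache = new InMemoryStore<>();
        Store<String, WebPage> index = new InMemoryStore<>();

        String key = "http://example.org/";
        rowStore.put(key, new WebPage(key, "Example"));
        copy(key, rowStore, cache, index);
        System.out.println(cache.get(key).title);
    }
}
```

The point of the sketch is that the bean never needs converting between stores; only the store implementations differ.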

Gora has limited support for SQL because, for the initial client of Gora,
Apache Nutch version 2.0, we wanted to offer a zero-configuration setup
using HSQLDB. For example, we only support single-column primary keys.
However, with the current implementation, we can use Nutch 2.0 to crawl the
web and store the data in HBase, Cassandra, or MySQL.

Hope that cleared things up a bit. Let me know if you have further
questions. As a wanna-be graduate of the Incubator, we really need more
interest and involvement from the community, so any kind of discussion,
questions, patches, etc. are more than welcome.

Cheers,
Enis
