Posted to dev@nutch.apache.org by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2010/04/13 20:35:06 UTC

[jira] Created: (NUTCH-811) Develop an ORM framework

Develop an ORM framework 
-------------------------

                 Key: NUTCH-811
                 URL: https://issues.apache.org/jira/browse/NUTCH-811
             Project: Nutch
          Issue Type: New Feature
            Reporter: Enis Soztutar
            Assignee: Enis Soztutar
             Fix For: 2.0


NUTCH-808 makes it clear that we need an ORM layer on top of the datastore, so that different backends can be used to store data. 

This issue will track the development of the ORM layer. Full support for HBase is planned first, with RDBMS, Hadoop MapFile and Cassandra support scheduled for later. 
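
To illustrate what "an ORM layer on top of the datastore" could look like, here is a minimal sketch. It is not the actual code planned for this issue: the names DataStore, InMemoryStore, WebPage and OrmSketch are hypothetical, and the in-memory map only stands in for a real backend such as HBase. The point is that calling code depends on the interface alone, so the backend can be swapped without touching it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical backend-neutral store interface (an assumption for
// illustration, not an existing Nutch API).
interface DataStore<K, V> {
    void put(K key, V value);
    V get(K key);          // returns null when the key is absent
    boolean delete(K key); // returns true if a non-null value was removed
}

// Stand-in backend that keeps rows in memory. A real backend (HBase,
// RDBMS, MapFile, Cassandra) would implement the same interface.
class InMemoryStore<K, V> implements DataStore<K, V> {
    private final Map<K, V> rows = new HashMap<>();

    public void put(K key, V value) { rows.put(key, value); }
    public V get(K key) { return rows.get(key); }
    public boolean delete(K key) { return rows.remove(key) != null; }
}

// Hypothetical persistent object.
class WebPage {
    final String url;
    final String content;

    WebPage(String url, String content) {
        this.url = url;
        this.content = content;
    }
}

class OrmSketch {
    // Calling code sees only the interface, never the concrete backend.
    static void storePage(DataStore<String, WebPage> store, WebPage page) {
        store.put(page.url, page);
    }

    public static void main(String[] args) {
        DataStore<String, WebPage> store = new InMemoryStore<>();
        storePage(store, new WebPage("http://example.org/", "hello"));
        System.out.println(store.get("http://example.org/").content);
    }
}
```

Replacing InMemoryStore with another implementation of the same interface would leave storePage unchanged, which is the property the ORM layer is meant to provide.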

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (NUTCH-811) Develop an ORM framework

Posted by "Enis Soztutar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861010#action_12861010 ] 

Enis Soztutar commented on NUTCH-811:
-------------------------------------

I have further developed the code that was once part of NutchBase for handling object-to-HBase mapping, and moved it into a new project, as per the earlier discussion on this issue. 

The project is named Gora, and it is hosted on GitHub at
http://github.com/enis/gora

A short design document is at http://wiki.github.com/enis/gora/design, and a quick start guide is at http://wiki.github.com/enis/gora/quick-start.

You can check out the code using
$ git clone git://github.com/enis/gora.git

What does this mean for Nutch?
Gora started as a part of Dogacan's NutchBase implementation, but the goals of the two projects are clearly different. Even so, Gora is primarily developed to handle Nutch's use cases. Specifically, Gora will provide the HBase integration layer for NutchBase, and later a Hadoop MapFile or TFile based persistence layer will be developed. 

In the short term, we plan to use Gora's artifacts as a library in Nutch. Either Dogacan or I will switch the current NutchBase branch to using Gora shortly. 


Gora is still in its very early stages and needs your support. We would be more than happy if the Nutch community could share comments, feedback, use cases, feature requests, or even patches. I suppose we can use this issue or the mailing list for that. 




[jira] Commented: (NUTCH-811) Develop an ORM framework

Posted by "Enis Soztutar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856546#action_12856546 ] 

Enis Soztutar commented on NUTCH-811:
-------------------------------------

Actually, we plan to develop the code for this layer in a separate project, because:
 - The ORM layer is orthogonal to the Nutch code, so it does not belong there
 - Extracting the code later will be much harder
 - If developed well, this code will be useful to other projects (interestingly, there is no API that supports both HBase and Cassandra)
 - The code will be much cleaner
 - Nutch can use the artifacts from this project

Nevertheless, we plan to piggyback on the Nutch community for initial development support, review and exposure. I will update this issue as the code develops, and will kindly ask for reviews. In the long term, we can move the project to the Apache Sandbox, or make it a Hadoop/Nutch subproject (once Nutch becomes a TLP). 

A design document and initial code will be available shortly. 


        

[jira] Commented: (NUTCH-811) Develop an ORM framework

Posted by "Enis Soztutar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865226#action_12865226 ] 

Enis Soztutar commented on NUTCH-811:
-------------------------------------

Hi Piet,
The code for Gora will reside on GitHub for now, since Nutch and Gora are pretty orthogonal. But as stated before, Nutch is the first user of Gora, and Gora does not yet have a separate community. So I intend to keep the Nutch community updated at all times (via this issue and the nutch-dev mailing list), and I hope for feedback from the Nutch community.

Moreover, NutchBase has already been ported to use Gora, so at some point Gora should be reviewed and accepted as a dependency of Nutch.



[jira] Commented: (NUTCH-811) Develop an ORM framework

Posted by "Piet Schrijver (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864744#action_12864744 ] 

Piet Schrijver commented on NUTCH-811:
--------------------------------------

Will development for Gora be tracked under this or any other Nutch ticket?
