Posted to users@jackrabbit.apache.org by woolly <p....@lbs-ltd.com> on 2007/06/06 18:11:00 UTC

Storing indexes in the database

Hi all,

I currently have everything mapped to the database using the repository.xml
below. Is it possible to also store the indexes in the database? Is that a
good idea? I'd like to have a "clean" application that doesn't create files
on the file system at startup.

<?xml version="1.0"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 1.2//EN"
                            "http://jackrabbit.apache.org/dtd/repository-1.2.dtd">
<!-- Example Repository Configuration File -->
<Repository>
    <!--
        virtual file system where the repository stores global state
        (e.g. registered namespaces, custom node types, etc.)
    -->
    <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
       
       
       
       
       
       
   </FileSystem>

    <!--
        security configuration
    -->
    <Security appName="Jackrabbit">
        <!--
            access manager:
            class: FQN of class implementing the AccessManager interface
        -->
        <AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager">
            <!--  -->
        </AccessManager>

        <LoginModule class="org.apache.jackrabbit.core.security.SimpleLoginModule">
           <!-- anonymous user name ('anonymous' is the default value) -->
           
           <!--
              default user name to be used instead of the anonymous user
              when no login credentials are provided (unset by default)
           -->
           <!--  -->
        </LoginModule>
    </Security>

    <!--
        location of workspaces root directory and name of default workspace
    -->
    <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default"/>
    <!--
        workspace configuration template:
        used to create the initial workspace if there's no workspace yet
    -->
    <Workspace name="${wsp.name}">
        <!--
            virtual file system of the workspace:
            class: FQN of class implementing the FileSystem interface
        -->
        <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
	       
	       
	       
	       
	       
	       
        </FileSystem>
        <!--
            persistence manager of the workspace:
            class: FQN of class implementing the PersistenceManager interface
        -->
        <PersistenceManager class="org.apache.jackrabbit.core.persistence.db.OraclePersistenceManager">
          
	       
	       
	       
          
          
        </PersistenceManager>

        <!--
            Search index and the file system it uses.
            class: FQN of class implementing the QueryHandler interface
        -->
        <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
            
        </SearchIndex>
        
    </Workspace>

    <!--
        Configures the versioning
    -->
    <Versioning rootPath="${rep.home}/version">
        <!--
            Configures the filesystem to use for versioning for the
            respective persistence manager
        -->
        <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
	       
	       
	       
	       
	       
	       
        </FileSystem>

        <!--
            Configures the persistence manager to be used for persisting
            version state. Please note that the current versioning
            implementation is based on a 'normal' persistence manager,
            but this could change in future implementations.
        -->
        <PersistenceManager class="org.apache.jackrabbit.core.persistence.db.OraclePersistenceManager">
          
	       
	       
	       
          
          
        </PersistenceManager>
    </Versioning>

    <!--
        Search index for content that is shared repository wide
        (/jcr:system tree, contains mainly versions)
    -->
    <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
        
    </SearchIndex>
</Repository>

Thanks,

Phil.


Re: Storing indexes in the database

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 6/6/07, woolly <p....@lbs-ltd.com> wrote:
> I currently have everything mapped to the database using the repository.xml
> below. Is it possible to also store the indexes in the database? Is that a
> good idea?

Unfortunately, it is not possible at the moment. Even though the
Jackrabbit configuration format does allow you to specify a
<FileSystem/> within the <SearchIndex/> configuration entry, the
current org.apache.jackrabbit.core.query.lucene.SearchIndex class
ignores such configuration and always uses the local file system for
storing the search index.
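
To make that concrete, here is a rough sketch of what such a
<SearchIndex/> entry looks like. The "path" parameter and its
${wsp.home}/index value are my assumption, based on the stock
Jackrabbit configuration, and are not taken from the repository.xml
above. The nested <FileSystem/> is shown only to illustrate an entry
that the configuration format accepts but the current SearchIndex
class ignores.

```xml
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <!-- Assumed parameter: directory on the local file system that
         holds the Lucene index files -->
    <param name="path" value="${wsp.home}/index"/>
    <!-- Accepted by the configuration format, but ignored by this
         SearchIndex class; the index is still written to local disk -->
    <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
        <!-- connection parameters omitted -->
    </FileSystem>
</SearchIndex>
```

Under these assumptions, the most you can control is where on the
local disk the index directory ends up.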

The main reason for always using the local file system is performance.
Jackrabbit uses Lucene as the query engine, and Lucene accesses its
segment files using a random access pattern. Typical databases do not
support efficient random access of blob values, which essentially
prevents any decent search performance with a database backend.

> I'd like to have a "clean" application that doesn't create files
> on the file system at startup.

This is a request we hear from many users, so I think it's worth
repeating and perhaps also raising in the issue tracker. However,
whether we should actually support that "feature" is a tricky
question.

Architecturally, Jackrabbit occupies the same layer as an RDBMS. A
content repository would clearly be a backend component in a typical
n-tier deployment scenario. This suggests that ideally Jackrabbit
shouldn't even be relying on any external databases, and should
instead handle all storage, both item persistence and search indexes,
locally within the specified repository home directory. This is in
fact the scenario that the original Jackrabbit persistence layer was
designed for, and interestingly we are currently seeing some advanced
development ideas that are going back to a similar design.

However, at the moment the only way to achieve proper ACID features in
Jackrabbit is to use either an embedded or a remote RDBMS for
persistence. Also, we currently do not have a high-performance remoting
layer, and native Jackrabbit backup tools are still severely lacking.
All these issues make remote database persistence for all Jackrabbit
content very desirable and I can well understand why many people are
asking for this.

BR,

Jukka Zitting