Posted to dev@lucene.apache.org by Amir Kibbar <am...@tangram-soft.co.il> on 2006/01/11 16:44:13 UTC

A Database as a Lucene Index Target

Hi,

I hope this mailing list is the right place for this kind of thing; if
not, I apologize in advance.

I've written an extension of the Directory class, called DBDirectory, that
allows you to read and write a Lucene index in a database instead of on a
file system.

This is done using blobs. Each blob represents a "file". Also, each blob has
a name which is equivalent to the filename and a prefix, which is equivalent
to a directory on a file system. This allows you to create multiple Lucene
indexes in a single database schema.

The solution uses two tables:
LUCENE_INDEX - which holds the index files as blobs
LUCENE_LOCK - holds the different locks
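
The naming scheme described above can be sketched without a database. This is
only an illustration, not the attached DBDirectory code: the class
BlobTableSketch, its methods, and the '/' separator are all hypothetical, and
an in-memory map stands in for the LUCENE_INDEX table. It shows how a
(prefix, name) pair lets several indexes share one table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the LUCENE_INDEX table: one row per
// (prefix, name) pair, with the blob holding that "file"'s bytes.
final class BlobTableSketch {
    // Key is prefix + '/' + name; assumes prefixes and names contain no '/'.
    private final Map<String, byte[]> rows = new TreeMap<>();

    private static String key(String prefix, String name) {
        return prefix + "/" + name;
    }

    // Insert or overwrite the blob for one index file.
    void write(String prefix, String name, byte[] content) {
        rows.put(key(prefix, name), content.clone());
    }

    // Return the blob for one index file, or null if there is no such row.
    byte[] read(String prefix, String name) {
        byte[] blob = rows.get(key(prefix, name));
        return blob == null ? null : blob.clone();
    }

    // List the file names stored under one prefix, like listing a directory.
    List<String> list(String prefix) {
        List<String> names = new ArrayList<>();
        String start = prefix + "/";
        for (String k : rows.keySet()) {
            if (k.startsWith(start)) {
                names.add(k.substring(start.length()));
            }
        }
        return names;
    }
}
```

Two indexes written under the prefixes "indexA" and "indexB" can each hold a
file called "segments" without colliding, because the prefix is part of the
key.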

Attached is my proposed solution. This solution is still very basic, but it
does the job.
The solution supports Oracle and MySQL.

To use this solution:

1. Place the files:
- DBDirectory in src/java/org/apache/lucene/store
- TestDBIndex in src/test/org/apache/lucene/index
- objects-mysql.sql in src/db
- objects-oracle.sql in src/db

2. Edit the parameters for the database connection in TestDBIndex

3. Create the database tables using the objects-mysql.sql script (assuming
you're using MySQL)

4. Build Lucene

5. Run TestDBIndex with the database driver in the classpath

I've tested the solution on MySQL; it *should* also work on Oracle, and I
will test that in a few days.

Please let me know if you think it is useful.

Amir

Re: A Database as a Lucene Index Target

Posted by Nicolas Belisle <Ni...@bibl.ulaval.ca>.
Hi,

I didn't receive the attachment. Maybe you can contribute your files to
JIRA: http://issues.apache.org/jira/browse/LUCENE

Have you checked the Compass framework: http://www.compassframework.org ?
They also developed a JDBC Directory implementation:
http://static.compassframework.org/docs/latest/jdbcdirectory.html
It's Apache licensed.


Regards,

Nicolas




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


need help understanding the source...

Posted by Aditya Liviandi <ad...@i2r.a-star.edu.sg>.
Which class writes to the .frq file?

I thought it was DocumentWriter, but the following code in
DocumentWriter.java doesn't really match what the file formats
documentation says the .frq file looks like:

        int postingFreq = posting.freq;
        if (postingFreq == 1)                     // optimize freq=1
          freq.writeVInt(1);                      // set low bit of doc num.
        else {
          freq.writeVInt(0);                      // the document number
          freq.writeVInt(postingFreq);            // frequency in doc
        }

The code in SegmentMerger.java seems to do the right thing:

        int freq = postings.freq();
        if (freq == 1) {
          freqOutput.writeVInt(docCode | 1);      // write doc & freq=1
        } else {
          freqOutput.writeVInt(docCode);          // write doc
          freqOutput.writeVInt(freq);             // write frequency in doc
        }

So what does the first portion (the one from DocumentWriter.java)
really do?

Why does it write 0 when postingFreq is more than one?
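
One possible reading, which nobody in this thread confirms: if the
DocumentWriter snippet is only ever writing a segment that holds a single
document, numbered 0, then its doc code is always 0 and the two snippets
produce the same bytes. The sketch below generalizes the SegmentMerger
snippet. The class FrqDocCodeSketch and the docDelta parameter are
hypothetical names, and computing docCode as the document delta shifted left
by one bit is an assumption based on the comments in the quoted code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the doc/freq encoding shown in the SegmentMerger snippet above.
final class FrqDocCodeSketch {
    // Returns the VInt values written for one posting. docDelta is assumed
    // to be the difference from the previous document number.
    static List<Integer> encode(int docDelta, int freq) {
        List<Integer> out = new ArrayList<>();
        int docCode = docDelta << 1;      // low bit left free for the freq flag
        if (freq == 1) {
            out.add(docCode | 1);         // low bit set: freq is implicitly 1
        } else {
            out.add(docCode);             // low bit clear: freq follows
            out.add(freq);
        }
        return out;
    }
}
```

With docDelta fixed at 0, encode(0, 1) yields the single value 1 and
encode(0, n) for n > 1 yields 0 followed by n, which is what the
DocumentWriter snippet writes.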




RE: A Database as a Lucene Index Target

Posted by David Freireich <da...@coresearchinc.com>.
Amir:
 
I'm interested in the type of work you are doing with Lucene.  
 
I own an extremely focused talent agency, Core Search Group. We
represent the top infrastructure software engineers in the world to
companies that must have the best talent possible to realize their
goals.  
 
Most of our clients are in the US.  Do you have any desire to work in
the States at some point in the future?  Your email address indicates
you are located in Israel now.
 
Thanks for your time.
 
DAVE
 
Dave Freireich
Core Search Group
Where software talent is number one.
803-771-4289 x12
www.coresearchinc.com
 