Posted to java-user@lucene.apache.org by Christian Rodriguez <cr...@gmail.com> on 2004/09/21 00:36:51 UTC

Problems with Lucene + BDB (Berkeley DB) integration

Hi everyone, 

I am trying to use the Lucene + BDB integration from the sandbox
(http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/contributions/db/).
I installed C Berkeley DB 4.2.52 and I have the Lucene jar file.

I have an example program that indexes 4 small text files in a
directory (it's very similar to the IndexFiles.java in the Lucene demo,
except that it uses BDB + Lucene). The problem is that running the
indexing program produces different results each time. For example:
if I start with an empty index, run the indexing program, and then
query the index, I get the correct results; if I then delete the index
to start from scratch and repeat the same sequence, I get no results. (?)

What puzzles me is the non-determinism: the same execution sequence
produces two different results. I then wrote a program to dump the
index and found that the list of files that end up in the index is
different every time I index those 4 files.

For example:
1st run: contents of directory: _4.f2, _4.f3, _4.cfs, _4.fdx, _4.fnm,
_4.frq, _4.prx, _4.tii, segments, deletable. (10 files)
2nd run: contents of directory: 0:_4.f1, _4.cfs, _4.fdt, _4.fdx,
_4.fnm, _4.frq, _4.prx, _4.tii, _4.tis, segments, deletable. (11
files)
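
A quick way to see exactly where the two runs disagree is to diff the two listings as sets. The sketch below is only an illustration: the class name IndexListingDiff and the helper onlyIn are hypothetical names, and the file names are copied verbatim from the listings above (including the "0:" prefix on the first entry of the second run, whose meaning is not stated in the post).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene or the BDB sandbox code.
class IndexListingDiff {

    // File names copied verbatim from the "1st run" listing in the post.
    static final List<String> RUN_1 = Arrays.asList(
            "_4.f2", "_4.f3", "_4.cfs", "_4.fdx", "_4.fnm",
            "_4.frq", "_4.prx", "_4.tii", "segments", "deletable");

    // File names copied verbatim from the "2nd run" listing,
    // including the unexplained "0:" prefix on the first entry.
    static final List<String> RUN_2 = Arrays.asList(
            "0:_4.f1", "_4.cfs", "_4.fdt", "_4.fdx", "_4.fnm",
            "_4.frq", "_4.prx", "_4.tii", "_4.tis", "segments", "deletable");

    // Returns the names present in 'a' but absent from 'b', sorted.
    static Set<String> onlyIn(Collection<String> a, Collection<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Only in 1st run: " + onlyIn(RUN_1, RUN_2));
        System.out.println("Only in 2nd run: " + onlyIn(RUN_2, RUN_1));
    }
}
```

Running the same comparison against listings from further runs would show whether the same files go missing each time or whether the set varies from run to run.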

Does anyone have any idea why this is happening?
Has anyone been able to use the BDB + Lucene integration with no problems?

I'd appreciate any help or pointers.
Thanks!
Xtian

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Problems with Lucene + BDB (Berkeley DB) integration

Posted by Christian Rodriguez <cr...@gmail.com>.
Andy, you are right. 

I tried with Lucene 1.3 and it worked perfectly. This should be added
to a README in the Lucene + BDB sandbox (or somewhere) so people don't
spend days struggling with the weird non-deterministic bugs I am
getting...

Now, I do need to use version 1.4, so I'd like to see if it's actually
the cluster file system causing the problem in the integration with
Berkeley DB. Is it possible to turn the cluster file system OFF in
Lucene 1.4?

Does anyone have any idea what else could be causing BDB + Lucene to
work with Lucene 1.3 and not with 1.4?

Thanks!
Xtian

On Mon, 20 Sep 2004 18:13:46 -0700, Andy Goodell <go...@gmail.com> wrote:
> I used BDB + lucene successfully using the lucene 1.3 distribution,
> but it broke in my application with the 1.4 distribution.  The 1.4
> dist uses a different file system by default, the "cluster file
> system", so maybe that is the source of the issues.
> 
> good luck,
> andy g



Re: Problems with Lucene + BDB (Berkeley DB) integration

Posted by Andy Goodell <go...@gmail.com>.
I used BDB + lucene successfully using the lucene 1.3 distribution,
but it broke in my application with the 1.4 distribution.  The 1.4
dist uses a different file system by default, the "cluster file
system", so maybe that is the source of the issues.

good luck,
andy g


