Posted to dev@lucene.apache.org by "PROYECTA.Fernandez Garcia, Ivan" <pr...@iberia.es> on 2004/11/17 15:49:11 UTC

Queries Lucene 1.3

Good afternoon everybody,

	First of all, thank you for your attention.

	We are using the Lucene 1.3 API to index and search text in PDF files.
	We have two environments to develop with it: Windows, using Apache
Tomcat 5.0, and Sun Solaris, using Oracle Application Server.
	First we extract the text of each page from the PDF file using the
Multivalent API (this process seems to run O.K.).
	Then we search for text in the index we have just created. At this
point we have the following problem:
		- If the PDF file has 10 pages, the text is found.
		- If the PDF file has more than 10 pages, the text is not
found.
	We have tried setting the IndexWriter.minMergeDocs attribute to two
different values: the total number of pages in the document, and 1.
	In both cases:
		- if the document is not big, indexing seems to run O.K. and
text search seems to run O.K.
		- if the document is big (600 pages), indexing fails with an
OutOfMemoryError.
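
	To make clear what the two minMergeDocs settings mean, here is a
small stand-alone Java sketch. It is a toy model only: it does not use
Lucene and it is not our attached code. The class BufferModel and all of
its methods are hypothetical names. It assumes only what the Lucene
documentation says about minMergeDocs, namely that added documents are
buffered in memory until that many have accumulated and are then written
out as a new segment.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT Lucene): documents are buffered in memory and written
// out in batches once minMergeDocs of them have accumulated.
final class BufferModel {
    private final int minMergeDocs;              // batch size before a flush
    private final List<String> buffer = new ArrayList<>();
    private int flushedDocs = 0;                 // documents written out so far
    private int peakBuffered = 0;                // largest in-memory batch seen

    BufferModel(int minMergeDocs) {
        if (minMergeDocs < 1) {
            throw new IllegalArgumentException("minMergeDocs must be >= 1");
        }
        this.minMergeDocs = minMergeDocs;
    }

    // Add one page; flush when the buffer reaches the threshold.
    public void addPage(String text) {
        buffer.add(text);
        peakBuffered = Math.max(peakBuffered, buffer.size());
        if (buffer.size() >= minMergeDocs) {
            flush();
        }
    }

    // Write out whatever is buffered (here: just count it and drop it).
    public void flush() {
        flushedDocs += buffer.size();
        buffer.clear();
    }

    // Closing flushes the pages that are still only in memory.
    public void close() {
        flush();
    }

    public int flushedDocs() { return flushedDocs; }
    public int peakBuffered() { return peakBuffered; }

    public static void main(String[] args) {
        int pages = 600;
        // The two settings tried above: total page count, and 1.
        for (int setting : new int[] {pages, 1}) {
            BufferModel model = new BufferModel(setting);
            for (int i = 0; i < pages; i++) {
                model.addPage("text of page " + i);
            }
            model.close();
            System.out.println("minMergeDocs=" + setting
                    + " peak pages in memory=" + model.peakBuffered()
                    + " pages written=" + model.flushedDocs());
        }
    }
}
```

	In this model a threshold equal to the page count keeps every page in
memory at once, a threshold of 1 writes each page out immediately, and
any pages still in the buffer are written only when close() is called.
Whether the real IndexWriter behaves the same way in our case is exactly
what we are unsure about.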

	We attach our source code, which indexes a PDF file and searches
its text, in case you can spot an error.
	We don't know what else to try with this problem.
	Can you help us, please?

Thank you for your help.

 <<search_text.txt>>  <<index_lucene.txt>> 


> Iván Fernández García
> Proyecta Sistemas de Información