Posted to java-user@lucene.apache.org by Humberto Rocha <hu...@gmail.com> on 2015/03/25 12:43:59 UTC

Problems with Lucene and BrazilianAnalyzer (lucene-core-4.9.0.jar and lucene-analyzers-common-4.9.0.jar): search returning more results than desired

Hi

I'm indexing 4 .txt files using:
- Lucene (lucene-core-4.9.0.jar)
- BrazilianAnalyzer (lucene-analyzers-common-4.9.0.jar)

The files have the following content:
- File A: tecnológico
- File B: tecnologico
- File C: tecnologias
- File D: tecnolo

For searching I use the same libraries:
- Lucene (lucene-core-4.9.0.jar)
- BrazilianAnalyzer (lucene-analyzers-common-4.9.0.jar)

Searching with the parameter "tecnologico", I get the following results:
- File A: tecnológico
- File B: tecnologico
- File C: tecnologias

I ran the same search against the same index in Luke, and it returns the
same results.

My question: is that correct?

Shouldn't I receive only:
- File A: tecnológico
- File B: tecnologico

Why is File C (tecnologias) also returned?

Is there any way to make the search return only these two results?

In this context, for example, I would like to receive only:
- File A: tecnológico
- File B: tecnologico
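
One possible explanation for the extra hit is stemming: BrazilianAnalyzer
applies a Portuguese stemmer, so different word forms may be reduced to the
same index term. The toy sketch below does not use Lucene at all. The suffix
list and the names ToyStemDemo, foldAccents and indexTerm are hypothetical,
chosen only to illustrate how accent folding plus suffix stripping could make
three of the four words collide while "tecnolo" stays separate. The real
stemmer's rules are different and more elaborate.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ToyStemDemo {
    // Hypothetical suffix list, for illustration only.
    // These are NOT the rules used by BrazilianAnalyzer.
    private static final List<String> SUFFIXES =
            Arrays.asList("icos", "ico", "ias", "ia");

    // Remove diacritics: decompose to NFD, then drop the combining marks.
    public static String foldAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
    }

    // Lowercase, fold accents, then strip the first matching suffix,
    // as long as at least 4 characters of the word remain.
    public static String indexTerm(String word) {
        String t = foldAccents(word.toLowerCase(Locale.ROOT));
        for (String suffix : SUFFIXES) {
            if (t.endsWith(suffix) && t.length() - suffix.length() >= 4) {
                return t.substring(0, t.length() - suffix.length());
            }
        }
        return t;
    }

    public static void main(String[] args) {
        // The four file contents from the post.
        Map<String, String> files = new LinkedHashMap<>();
        files.put("File A", "tecnol\u00f3gico");
        files.put("File B", "tecnologico");
        files.put("File C", "tecnologias");
        files.put("File D", "tecnolo");

        // The query goes through the same toy analysis as the documents.
        String queryTerm = indexTerm("tecnologico");
        for (Map.Entry<String, String> e : files.entrySet()) {
            String term = indexTerm(e.getValue());
            System.out.println(e.getKey() + ": " + e.getValue()
                    + " -> " + term
                    + (term.equals(queryTerm) ? " (match)" : ""));
        }
    }
}
```

In this toy version the query and Files A, B and C all reduce to the same
term, which would reproduce the result set above. To see the terms the real
analyzer produces, the TokenStream returned by analyzer.tokenStream(...) can
be iterated and its CharTermAttribute printed for each of the four words.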

- The code snippet used for indexing follows:

    . . .
    BrazilianAnalyzer analyzer = new BrazilianAnalyzer(Version.LUCENE_4_9);
    IndexWriterConfig config =
        new IndexWriterConfig(Version.LUCENE_4_9, analyzer);
    Directory d = new SimpleFSDirectory(indexDir);
    writer = new IndexWriter(d, config);
    Document doc = new Document();
    doc.add(new TextField("filename", file.getAbsolutePath(),
        Field.Store.YES));
    doc.add(new TextField("contents", getTika().parseToString(f),
        Field.Store.YES));
    writer.addDocument(doc);
    . . .

- The code snippet used for searching follows:

    . . .
    Directory diretorio =
        new SimpleFSDirectory(new File(localizacaoIndicesLucene));
    // Open the reader on the directory declared above
    IndexReader leitor = DirectoryReader.open(diretorio);
    IndexSearcher buscador = new IndexSearcher(leitor);
    // Same analyzer as at index time
    BrazilianAnalyzer analisador = new BrazilianAnalyzer(Version.LUCENE_4_9);
    QueryParser parser =
        new QueryParser(Version.LUCENE_4_9, "contents", analisador);
    Query query = parser.parse(parametro);
    TopDocs resultado = buscador.search(query, 10);
    . . .


I appreciate the help!

Thanks a lot!

Humberto