Posted to java-user@lucene.apache.org by Ivana Spasojevic <iv...@gmail.com> on 2020/04/16 12:29:43 UTC

DelimitedBoostTokenFilterFactory Issue - Boosting and StandardTokenizerFactory

Hi there,

I’m developing a custom Java application with Lucene 8.5.0.

I've tried to use DelimitedBoostTokenFilterFactory, but I've run into a
problem. Please let me know if I'm doing something wrong.

I’m using StandardAnalyzer for search, and my SynonymGraphFilter is
configured as follows:

Map<String, String> synonymParam = new HashMap<>();
synonymParam.put("synonyms", synonymFileName);
synonymParam.put("ignoreCase", "true");
synonymParam.put("format", "solr");
synonymParam.put("expand", "true");
synonymParam.put("tokenizerFactory", "org.apache.lucene.analysis.core.StandardTokenizerFactory");

Map<String, String> delimitedBoostTokenFilterMap = new HashMap<>();
delimitedBoostTokenFilterMap.put("delimiter", "|");

Analyzer customAnalyzer = CustomAnalyzer.builder(Paths.get(synonymFolder))
        .withTokenizer(StandardTokenizerFactory.NAME)
        .addTokenFilter(SynonymGraphFilterFactory.NAME, synonymParam)
        .addTokenFilter(DelimitedBoostTokenFilterFactory.NAME, delimitedBoostTokenFilterMap)
        .build();


Here’s my debug output:

Query:  +spanOr([spanNear([morphology_term_original_name:tumor,
morphology_term_original_name:0.8], 0, true),
spanNear([morphology_term_original_name:neoplasm,
morphology_term_original_name:0.7], 0, true),
spanNear([morphology_term_original_name:tumour,
morphology_term_original_name:0.6], 0, true)])
(spanOr([spanNear([morphology_term_pathognomonic:tumor,
morphology_term_pathognomonic:0.8], 0, true),
spanNear([morphology_term_pathognomonic:neoplasm,
morphology_term_pathognomonic:0.7], 0, true)
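
For reference, here is a minimal plain-Java sketch (no Lucene dependency) of the per-token splitting that a delimited-boost filter is understood to perform. It is an illustration, not Lucene's actual code: the class DelimitedBoostSketch and its parse method are hypothetical names, and the input "tumor|0.8" assumes synonym entries written in term|boost form, which the configuration above does not show. The point is that the boost can only be parsed when the delimiter is still inside the token. If an earlier step has already split the text into "tumor" and "0.8", as the debug output above suggests, each piece is treated as an ordinary term.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Lucene.
final class DelimitedBoostSketch {

    /** A term together with the boost parsed from it (1.0f when no delimiter is present). */
    static final class BoostedTerm {
        final String term;
        final float boost;

        BoostedTerm(String term, float boost) {
            this.term = term;
            this.boost = boost;
        }

        @Override
        public String toString() {
            return term + "^" + boost;
        }
    }

    /** Splits one token at the first delimiter and parses the remainder as a float boost. */
    static BoostedTerm parse(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        if (idx < 0) {
            // No delimiter in this token: keep it as a plain term with the default boost.
            return new BoostedTerm(token, 1.0f);
        }
        float boost = Float.parseFloat(token.substring(idx + 1));
        return new BoostedTerm(token.substring(0, idx), boost);
    }

    public static void main(String[] args) {
        // Case 1: the delimiter survives tokenization, so the boost is attached to the term.
        System.out.println(parse("tumor|0.8", '|'));

        // Case 2: the token was already split upstream, so "0.8" is just another term.
        for (String token : Arrays.asList("tumor", "0.8")) {
            System.out.println(parse(token, '|'));
        }
    }
}
```

In the second case nothing carries the boost any more, which would match the query above, where 0.8, 0.7 and 0.6 appear as separate terms inside spanNear rather than as boosts.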

Thanks in advance!

Ivana