You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Abhishek Shivkumar <as...@in.ibm.com> on 2013/04/18 14:34:00 UTC
Why doesn't this code run - Adding synonyms from Wordnet to Lucene Index
I am writing this code as part of my CustomAnalyzer:
public class CustomAnalyzer extends Analyzer {
SynonymMap mySynonymMap = null;
CustomAnalyzer() throws IOException {
SynonymMap.Builder builder = new SynonymMap.Builder(true);
FileReader fr = new
FileReader("/home/watsonuser/Downloads/wordnetSynonyms.txt");
BufferedReader br = new BufferedReader(fr);
String line = "";
while ((line = br.readLine()) != null) {
String[] synset = line.split(",");
for(String syn: synset)
builder.add(new CharsRef(synset[0]), new CharsRef(syn),
true);
}
br.close();
fr.close();
try {
mySynonymMap = builder.build();
} catch (IOException e) {
System.out.println("Unable to build synonymMap");
e.printStackTrace();
}
}
public TokenStream tokenStream(String fieldName, Reader reader) {
TokenStream result = new PorterStemFilter(new SynonymFilter(
(new StopFilter(true,new
LowerCaseFilter
(new StandardFilter(new
StandardTokenizer
(Version.LUCENE_36,reader)
)
),StopAnalyzer.ENGLISH_STOP_WORDS_SET)), mySynonymMap, true)
);
}
}
Now, if I use the same CustomAnalyzer as part of my querying, then if I
enter the query as
myFieldName: manager
it expands the query with synonyms for manager.
But, I want the synonyms to be part of only my index and I don't want my
query to be expanded with synonyms.
So, when I removed the SynonymFilter from my CustomAnalyzer only when
querying the index, the query remains as
myFieldName: manager
but, it fails to retrieve documents that have the synonyms of manager.
How do we solve this problem?
Thanks
Abhishek S