Posted to java-user@lucene.apache.org by Priyanka Tufchi <pr...@launchship.com> on 2015/05/05 14:17:42 UTC
Finding Issue with NgramAnalyzer in Apache Lucene
Hi all
I am trying to use Apache Lucene to split text into n-grams:
Reader reader = new StringReader("This is a test string");
NGramTokenizer gramTokenizer = new NGramTokenizer(reader, 1, 3);
CharTermAttribute charTermAttribute =
gramTokenizer.addAttribute(CharTermAttribute.class);
gramTokenizer.reset();
while (gramTokenizer.incrementToken()) {
String token = charTermAttribute.toString();
System.out.println(token);
}
gramTokenizer.end();
gramTokenizer.close();
This is the code I used, but it returns character-level tokens. I want
it to return word-level terms such as "this", "test", "string", "this test", etc.
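To make the desired output concrete, here is a minimal plain-Java sketch. It is not Lucene code: the class name WordNGrams and the method wordNGrams are hypothetical, and it assumes words are separated by whitespace. It only illustrates the word-level n-grams I am after.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds word-level n-grams.
final class WordNGrams {

    // Returns every run of minSize..maxSize consecutive words, joined by a space.
    // Assumption: words are separated by whitespace only.
    static List<String> wordNGrams(String text, int minSize, int maxSize) {
        if (minSize < 1 || maxSize < minSize) {
            throw new IllegalArgumentException("need 1 <= minSize <= maxSize");
        }
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            // Grow the window from minSize up to maxSize while it still fits.
            for (int size = minSize; size <= maxSize && start + size <= words.length; size++) {
                result.add(String.join(" ", Arrays.copyOfRange(words, start, start + size)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String gram : wordNGrams("This is a test string", 1, 2)) {
            System.out.println(gram);
        }
    }
}
```

With sizes 1 to 2 this yields "This", "This is", "is", "is a", "a", "a test", "test", "test string", "string", which is the kind of term list I would like to get from Lucene.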
===================
I also tried ShingleFilter, but it throws a NullPointerException:
Reader reader = new StringReader("This is a test string");
TokenStream tokenizer = new StandardTokenizer(Version.LUCENE_41, reader);
tokenizer = new ShingleFilter(tokenizer, 2, 3);
CharTermAttribute charTermAttribute =
tokenizer.addAttribute(CharTermAttribute.class);
while (tokenizer.incrementToken()) {
String token = charTermAttribute.toString();
System.out.println(token);
}
Please advise.
Thanks