Posted to java-user@lucene.apache.org by Weiwei Wang <ww...@gmail.com> on 2009/12/12 14:53:40 UTC

Tell me the difference

Hi all,
     Suppose I want to index the string NashQ/c++.test, and I use the
following procedure to process it.
NormalizeCharMap RECOVERY_MAP = new NormalizeCharMap();
RECOVERY_MAP.add("c++","cplusplus$");
// LowercaseCharFilter: see the source code at the bottom
CharFilter filter = new LowercaseCharFilter(reader);
filter = new MappingCharFilter(RECOVERY_MAP,filter);
StandardTokenizer tokenStream = new StandardTokenizer(Version.LUCENE_30,
filter);
tokenStream.setMaxTokenLength(maxTokenLength);
TokenStream result = new StandardFilter(tokenStream);
//result = new LowerCaseFilter(result);
result = new StopFilter(enableStopPositionIncrements, result, stopSet);
result = new SnowballFilter(result, STEMMER);

When I search this index with the keyword nashq, nothing is returned, but if I
uncomment result = new LowerCaseFilter(result); it works fine.

Does anybody here know what's going on? It seems my LowercaseCharFilter has
already done this job.
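
For reference, this is the character stream I expect the tokenizer to receive
after the two char filters. It is only a sketch in plain Java, not Lucene code:
String.toLowerCase and String.replace are stand-ins I am assuming behave like
LowercaseCharFilter and the "c++" mapping in MappingCharFilter, and the class
name ExpectedCharFilterOutput is made up for this example.

```java
import java.util.Locale;

public class ExpectedCharFilterOutput {

    // Stand-in for LowercaseCharFilter followed by the "c++" -> "cplusplus$"
    // mapping. This only models the intended character-level result; it does
    // not call any Lucene class.
    static String expected(String raw) {
        String lowered = raw.toLowerCase(Locale.ROOT);
        // String.replace treats both arguments literally, so '+' and '$'
        // need no escaping here.
        return lowered.replace("c++", "cplusplus$");
    }

    public static void main(String[] args) {
        System.out.println(expected("NashQ/c++.test")); // nashq/cplusplus$.test
    }
}
```

If the char filters behave as intended, the text handed to StandardTokenizer
should already be all lowercase.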

//code for LowercaseCharFilter
package analysis;

import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.BaseCharFilter;
import org.apache.lucene.analysis.CharReader;
import org.apache.lucene.analysis.CharStream;


public class LowercaseCharFilter extends BaseCharFilter
{

    public LowercaseCharFilter(CharStream in)
    {
        super(in);
    }

    public LowercaseCharFilter(Reader in)
    {
        super(CharReader.get(in));
    }

    // Lowercase a single character; pass the end-of-stream marker (-1) through.
    @Override
    public int read() throws IOException
    {
        int c = input.read();
        return c == -1 ? -1 : Character.toLowerCase(c);
    }

    // Lowercase, in place, the characters the wrapped stream just delivered.
    @Override
    public int read(char[] cbuf, int off, int len) throws IOException
    {
        int ret = input.read(cbuf, off, len);
        if (ret != -1)
        {
            for (int i = off; i < off + ret; i++)
                cbuf[i] = Character.toLowerCase(cbuf[i]);
        }
        return ret;
    }
}
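
To check the lowercasing logic in isolation from Lucene, the same two read()
overrides can be exercised on a plain java.io reader. This is a minimal sketch
under assumptions: java.io.FilterReader stands in for BaseCharFilter, and the
names LowercaseReaderCheck, LowercaseReader, viaSingleCharRead and viaBulkRead
are hypothetical helpers, not part of my analyzer or of Lucene.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LowercaseReaderCheck {

    // Hypothetical stand-in for LowercaseCharFilter, built on FilterReader
    // so that it compiles without Lucene on the classpath.
    static class LowercaseReader extends FilterReader {
        LowercaseReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = in.read();
            return c == -1 ? -1 : Character.toLowerCase(c);
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            int ret = in.read(cbuf, off, len);
            // When ret is -1 the loop body never runs.
            for (int i = off; i < off + ret; i++) {
                cbuf[i] = Character.toLowerCase(cbuf[i]);
            }
            return ret;
        }
    }

    // Drains the reader one character at a time.
    static String viaSingleCharRead(String text) {
        try (Reader r = new LowercaseReader(new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Drains the reader through the bulk read, using a deliberately small buffer.
    static String viaBulkRead(String text) {
        try (Reader r = new LowercaseReader(new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaSingleCharRead("NashQ/c++.test")); // nashq/c++.test
        System.out.println(viaBulkRead("NashQ/c++.test"));       // nashq/c++.test
    }
}
```

Both read paths lowercase the input here, which only shows that the overrides
themselves do what I expect on a plain Reader; it says nothing about how the
tokenizer consumes the filter.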

-- 
Weiwei Wang
Alex Wang
王巍巍
Room 403, Mengmin Wei Building
Computer Science Department
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Homepage: http://cs.nju.edu.cn/rl/weiweiwang