
Posted to issues@lucene.apache.org by "SrikanthMedisetti (via GitHub)" <gi...@apache.org> on 2023/05/02 23:15:20 UTC

[GitHub] [lucene] SrikanthMedisetti opened a new issue, #12259: Case insensitive search

SrikanthMedisetti opened a new issue, #12259:
URL: https://github.com/apache/lucene/issues/12259

   ### Description
   
   Hello Team,
   I have the following use case.
   I want to store/index data without tokenizing it (using StringField) and keep the original string, including its case, when storing. The problem I face is during search: if I store a field called name with the value 'Lucene' and then search for 'lucene', I do not get the stored document with 'Lucene' back.
   
   I'm trying to use Lucene 9.5.0 
   
   `public class LuceneExample {
       public static void main(String[] args) throws IOException, ParseException {
           // Create index
           KeywordAnalyzer analyzer = new KeywordAnalyzer();
           String indexDirectoryPath = new StringBuilder("/temp/").append("test").toString();
           // create the directory to store the index
           Directory directory = FSDirectory.open(Paths.get(indexDirectoryPath));
           IndexWriterConfig config = new IndexWriterConfig(analyzer);
           IndexWriter writer = new IndexWriter(directory, config);
   
           Document doc = new Document();
           doc.add(new StringField("name", "John Smith", Field.Store.YES));
           writer.addDocument(doc);
           writer.close();
   
           // Search index
   
           // create a reader to read the index
           IndexReader reader = DirectoryReader.open(directory);
           IndexSearcher searcher = new IndexSearcher(reader);
           QueryParser parser = new QueryParser("name", analyzer);
           Query query = parser.parse("john smith");
           TopDocs results = searcher.search(query, 10);
           for (int i = 0; i < results.scoreDocs.length; i++) {
               Document hitDoc = searcher.doc(results.scoreDocs[i].doc);
               System.out.println(hitDoc.get("name"));
           }
       }
   }`
   
   Am I missing something here? Could you please help me out?
   
   Thanks in advance!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] mkhludnev commented on issue #12259: Case insensitive search

Posted by "mkhludnev (via GitHub)" <gi...@apache.org>.

mkhludnev commented on issue #12259:
URL: https://github.com/apache/lucene/issues/12259#issuecomment-1534327232

   [SrikanthMedisetti] please use TextField instead. Not an issue.



[GitHub] [lucene] mkhludnev closed issue #12259: Case insensitive search

Posted by "mkhludnev (via GitHub)" <gi...@apache.org>.

mkhludnev closed issue #12259: Case insensitive search
URL: https://github.com/apache/lucene/issues/12259



[GitHub] [lucene] MarcusSorealheis commented on issue #12259: Case insensitive search

Posted by "MarcusSorealheis (via GitHub)" <gi...@apache.org>.

MarcusSorealheis commented on issue #12259:
URL: https://github.com/apache/lucene/issues/12259#issuecomment-1533645232

   You are. The prescribed method for this sort of pre-processing is the analysis chain.
   
   If you add this line (and associated imports) to the top of this snippet you should be good to go:
   
   `StandardAnalyzer analyzer = new StandardAnalyzer();`
   
   Keep in mind that the standard analyzer does not remove stop words by default, so the text should be largely unaffected apart from tokenization and lowercasing. If you only want to relax case sensitivity, you can look into the [LowerCaseFilter](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/analysis/LowerCaseFilter.java), which lowercases each token character by character. Be sure to apply it at both index time and query time.
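
For readers who want the whole value kept as a single token, as in the original question, here is a minimal sketch of the "same analysis chain at index and query time" advice. It is an illustration, not the maintainers' prescribed fix. The class names `CaseInsensitiveExactMatch` and `LowercaseKeywordAnalyzer` and the `search` helper are hypothetical. It assumes Lucene 9.x with both `lucene-core` and `lucene-analysis-common` on the classpath, and it uses an in-memory `ByteBuffersDirectory` instead of the `/temp/test` path from the question. The field is a `TextField` because a `StringField` is indexed verbatim and never passes through the analyzer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.KeywordTokenizer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class CaseInsensitiveExactMatch {

    /** Hypothetical analyzer: the whole value is one token, lowercased. */
    static final class LowercaseKeywordAnalyzer extends Analyzer {
        @Override
        protected TokenStreamComponents createComponents(String fieldName) {
            Tokenizer source = new KeywordTokenizer();       // no splitting on whitespace
            TokenStream sink = new LowerCaseFilter(source);  // index-time lowercasing
            return new TokenStreamComponents(source, sink);
        }

        @Override
        protected TokenStream normalize(String fieldName, TokenStream in) {
            return new LowerCaseFilter(in);                  // same lowercasing for query terms
        }
    }

    /** Indexes one document ("John Smith") and returns the stored names matching userInput. */
    static List<String> search(String userInput) {
        try (Directory directory = new ByteBuffersDirectory();
             Analyzer analyzer = new LowercaseKeywordAnalyzer()) {

            try (IndexWriter writer = new IndexWriter(directory, new IndexWriterConfig(analyzer))) {
                Document doc = new Document();
                // TextField is analyzed; the stored value keeps its original case.
                doc.add(new TextField("name", "John Smith", Field.Store.YES));
                writer.addDocument(doc);
            }

            try (DirectoryReader reader = DirectoryReader.open(directory)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                // Normalize the user input with the same analyzer before building the term.
                TermQuery query = new TermQuery(new Term("name", analyzer.normalize("name", userInput)));
                List<String> hits = new ArrayList<>();
                for (ScoreDoc scoreDoc : searcher.search(query, 10).scoreDocs) {
                    hits.add(searcher.doc(scoreDoc.doc).get("name"));
                }
                return hits;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(search("john smith"));
    }
}
```

With this setup the indexed term is `john smith` while the stored value stays `John Smith`, so a search for `john smith` or `JOHN SMITH` should return the original string, and a partial input such as `john` should return nothing because the value is never split into words.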

