Posted to java-user@lucene.apache.org by Ankit Murarka <an...@rancoretech.com> on 2013/08/02 07:39:43 UTC

Re: Did you Mean search on Indexes created by Different Files. -Completed.

This may sound a bit awkward, but I am now able to implement "Did you mean" 
search on indexes.

The only help came from the "Lucene In Action" book.

It occurred to me that once I have created the index of my documents, I need to 
pass that index to SpellChecker so that it can create its own index (in a new 
directory, obviously).

Then I gave this new directory path to the SpellChecker to search, and it 
gave me what I wanted: word suggestions drawn from the documents I supplied as 
input.
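
For anyone following along, the two-step idea (first harvest the terms of the
main index into a separate spell index, then query that spell index) can be
illustrated without Lucene at all. The sketch below is plain Java and is not
Lucene's implementation: IndexTermSuggester, addDocument and the edit-distance
ranking are hypothetical stand-ins chosen for illustration. In Lucene 4.x the
corresponding step is presumably SpellChecker.indexDictionary fed with a
LuceneDictionary built from the main index's reader and field name, but
please verify that against the javadoc for your version.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Library-free illustration only; NOT Lucene's SpellChecker.
class IndexTermSuggester {
    // Distinct terms harvested from the indexed documents (the "spell index").
    private final Set<String> vocabulary = new TreeSet<>();

    // Step 1: build the suggestion vocabulary from the document text itself.
    // Tokens are kept case-sensitive, as in the thread's custom analyzer.
    void addDocument(String text) {
        for (String token : text.split("\\W+")) {
            if (!token.isEmpty()) {
                vocabulary.add(token);
            }
        }
    }

    // Step 2: return up to maxSuggestions vocabulary terms that are within
    // maxDistance edits of the word, closest first (ties in natural order).
    List<String> suggestSimilar(String word, int maxSuggestions, int maxDistance) {
        List<String> candidates = new ArrayList<>();
        for (String term : vocabulary) {
            if (!term.equals(word) && editDistance(word, term) <= maxDistance) {
                candidates.add(term);
            }
        }
        candidates.sort(Comparator
                .comparingInt((String t) -> editDistance(word, t))
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(
                candidates.subList(0, Math.min(maxSuggestions, candidates.size())));
    }

    // Classic Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        IndexTermSuggester suggester = new IndexTermSuggester();
        // Hypothetical log lines standing in for the indexed files.
        suggester.addDocument("ERROR Connection timeout while contacting server");
        suggester.addDocument("WARN Connection reset by peer");
        for (String s : suggester.suggestSimilar("Conection", 10, 2)) {
            System.out.println("Did you mean: " + s);
        }
    }
}
```

The point of the sketch is only that the suggestions come from the terms in
your own files, not from an external word list; the real SpellChecker stores
its own n-gram based index in the new directory rather than scanning a set.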

Hopefully someone will find this useful.

On 8/1/2013 10:44 AM, Ankit Murarka wrote:
> Can anyone please guide me on how to implement "Did You Mean" search 
> using indexes created from a set of files supplied as input?
>
> On 7/31/2013 11:15 AM, Ankit Murarka wrote:
>> Any help on this would be highly appreciated. I have been trying every 
>> option I can think of, but to no avail.
>>
>> I also tried LuceneDictionary, but that does not seem to help either.
>>
>> Please guide.
>>
>> On 7/30/2013 4:49 PM, Ankit Murarka wrote:
>>> Hello.
>>>
>>> Using DirectSpellChecker is not serving my purpose. It seems to 
>>> return word suggestions from a dictionary, whereas I wish to return 
>>> search suggestions from the indexes I created from my own files 
>>> (these are generally log files).
>>>
>>> I created indexes for certain files in D:\\Indexes and the field 
>>> name is "content"
>>>
>>> Then I used DirectSpellChecker and passed an IndexReader to 
>>> it. I invoked suggestSimilar, which returns a SuggestWord array, 
>>> and iterated over the array.
>>>
>>> I seem to get suggested words from the dictionary and not from the 
>>> indexes.
>>>
>>> Code Snippet for the search part:
>>>
>>> String index="D:\\Indexes";
>>> String field = "contents";
>>> IndexReader reader = DirectoryReader.open(FSDirectory.open(new 
>>> File(index)));
>>> DirectSpellChecker dsc=new DirectSpellChecker();
>>> Term term1=new Term(field, "Amrih");
>>> SuggestWord[] suggestWord=dsc.suggestSimilar(term1, 10, reader);
>>> if(suggestWord!=null && suggestWord.length>0)
>>>         {
>>>             for(SuggestWord word:suggestWord)
>>>             {
>>>                 System.out.println("Did you Mean  "  + word.string );
>>>             }
>>>
>>>         }
>>>         else
>>>         {
>>>             System.out.println("No Suggestions found");
>>>         }
>>>
>>>
>>> Please guide. Basically, the suggested words should come from 
>>> the indexes I have created, not from any 
>>> dictionary. Is this possible?
>>>
>>>
>>> On 7/29/2013 9:34 PM, Varun Thacker wrote:
>>>> Hi,
>>>>
>>>>
>>>> On Mon, Jul 29, 2013 at 4:36 PM, Ankit Murarka<
>>>> ankit.murarka@rancoretech.com>  wrote:
>>>>
>>>>> Since I am new to this, I can't stop exploring it and trying to use
>>>>> different features.
>>>>>
>>>>> I am now trying to implement "Did you mean" search using the
>>>>> SpellChecker jar and the Lucene jar.
>>>>>
>>>>> I faced plenty of problems, although I have got it working.
>>>>>
>>>>> code snippet:
>>>>>
>>>>> File dir = new File("D:\\Inde\\");
>>>>> Directory directory = FSDirectory.open(dir);
>>>>> SpellChecker spellChecker = new SpellChecker(directory);
>>>>> String wordForSuggestions = "aski";
>>>>> Analyzer analyzer = new CustomAnalyzerForCaseSensitive(Version.LUCENE_43);
>>>>> // This analyzer only has commented LowerCaseFilter.
>>>>> IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43, analyzer);
>>>>> IndexWriter writer = new IndexWriter(directory, iwc);
>>>>> File file1 = new File("D:\\Inde\\wordlist.txt");
>>>>> indexDocs(writer, file1);
>>>>> writer.close();
>>>>> spellChecker.indexDictionary(
>>>>>     new PlainTextDictionary(new File("D:\\Inde\\wordlist.txt")), iwc, false);
>>>>> int suggestionsNumber = 10;
>>>>> String[] suggestions =
>>>>>     spellChecker.suggestSimilar(wordForSuggestions, suggestionsNumber);
>>>>> if (suggestions != null && suggestions.length > 0) {
>>>>>     for (String word : suggestions) {
>>>>>         System.out.println("Did you mean:" + word + "");
>>>>>     }
>>>>> } else {
>>>>>     System.out.println("No suggestions found for word:" + wordForSuggestions);
>>>>> }
>>>>>
>>>>> The code works fine. It suggests 10 possible matches.
>>>>> The problem is that I am creating/updating the indexes every time.
>>>>>
>>>>> Suppose I have 1000 log files and these files are indexed in
>>>>> D:\\LogIndexes. Instead of reading a standard dictionary and
>>>>> building up indexes from it, I wish to use these existing indexes
>>>>> to suggest possible matches.
>>>>>
>>>>> Is it possible to do this? If yes, what would be the approach?
>>>>> Please provide some assistance.
>>>>>
>>>>
>>>> Check out DirectSpellChecker
>>>> (http://lucene.apache.org/core/4_3_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html).
>>>>
>>>> Using DirectSpellChecker you do not need to build a separate spell
>>>> index; it uses the actual index for spell suggestions.
>>>>
>>>>
>>>>
>>>>> The next question would be how to suggest a phrase. If I enter
>>>>> "Head ach heav", then I should get "Head ache heavy" as one possible
>>>>> suggestion. I haven't tried it yet, but it would surely be an
>>>>> absolute beauty to have.
>>>>>
>>>> DirectSpellChecker works on a single term, so there is no feature
>>>> that will give you suggestions on a phrase out of the box.
>>>>
>>>> You might want to take each term of the query, check it for spelling
>>>> mistakes, and then combine the terms back again. You could look up
>>>> the code from
>>>> Solr.SpellCheckComponent.addCollationsToResponse
>>>>
>>>> http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate
>>>>
>>>> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java 
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Also, the examples available on the net for "Did you mean" are very
>>>>> old, and the API has undergone significant changes, which makes them
>>>>> not very useful.
>>>>>
>>>>>
>>>>> -- 
>>>>> Regards
>>>>>
>>>>> Ankit Murarka
>>>>>
>>>>> "Peace is found not in what surrounds us, but in what we hold 
>>>>> within."
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
>
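
On the phrase question in the quoted thread ("Head ach heav"): the per-term
approach suggested there, correcting each term on its own and then joining
the terms back together, can be sketched roughly as follows. This is plain
Java with no Lucene dependency. PhraseCorrector and the bestSuggestion hook
are hypothetical names; in practice the hook would wrap whatever single-term
spell checker is in use and return its top suggestion, if it has one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Illustration of per-term correction; not Solr's collation logic.
class PhraseCorrector {
    // bestSuggestion is a hypothetical hook: it returns the preferred
    // replacement for a term, or empty when the term should be kept as typed.
    static String correctPhrase(String phrase,
                                Function<String, Optional<String>> bestSuggestion) {
        List<String> corrected = new ArrayList<>();
        for (String term : phrase.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // happens only for an empty or all-whitespace phrase
            }
            corrected.add(bestSuggestion.apply(term).orElse(term));
        }
        return String.join(" ", corrected);
    }

    public static void main(String[] args) {
        // Made-up corrections standing in for a real single-term spell checker.
        Map<String, String> fixes = new HashMap<>();
        fixes.put("ach", "ache");
        fixes.put("heav", "heavy");
        System.out.println(
                correctPhrase("Head ach heav", t -> Optional.ofNullable(fixes.get(t))));
    }
}
```

Correcting each term independently ignores whether the corrected terms
actually occur together in any document, so the combined phrase may still
return no hits; the Solr collation code linked in the thread is the place to
look for how that is handled there.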


-- 
Regards

Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within us"

