Posted to java-user@lucene.apache.org by Ankit Murarka <an...@rancoretech.com> on 2013/08/01 07:14:19 UTC

Re: Did you Mean search on Indexes created by Different Files.

Can anyone please guide me on how to implement "Did You Mean" search
using indexes created from my own set of input files?

On 7/31/2013 11:15 AM, Ankit Murarka wrote:
> Any help on this would be highly appreciated. I have been trying every
> option I can think of, but to no avail.
>
> I also tried LuceneDictionary, but that does not seem to help either.
>
> Please guide.
>
> On 7/30/2013 4:49 PM, Ankit Murarka wrote:
>> Hello.
>>
>> Using DirectSpellChecker is not serving my purpose. It seems to
>> return word suggestions from a dictionary, whereas I want
>> search suggestions from the indexes I created from my own files
>> (these are generally log files).
>>
>> I created indexes for certain files in D:\\Indexes and the field name
>> is "content".
>>
>> Then I created a DirectSpellChecker and passed it an IndexReader. I
>> invoked suggestSimilar, got a SuggestWord[] array as the output, and
>> iterated over the array.
>>
>> I seem to get suggested words from the dictionary and not from the
>> indexes.
>>
>> Code Snippet for the search part:
>>
>> String index = "D:\\Indexes";
>> String field = "contents";
>> IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
>> DirectSpellChecker dsc = new DirectSpellChecker();
>> Term term1 = new Term(field, "Amrih");
>> SuggestWord[] suggestWord = dsc.suggestSimilar(term1, 10, reader);
>> if (suggestWord != null && suggestWord.length > 0) {
>>     for (SuggestWord word : suggestWord) {
>>         System.out.println("Did you Mean  " + word.string);
>>     }
>> } else {
>>     System.out.println("No Suggestions found");
>> }
>>
>>
>> Please guide. Basically, the suggested words should come from the
>> indexes I have created, not from any dictionary.
>> Is that possible?
>>
>>
>> On 7/29/2013 9:34 PM, Varun Thacker wrote:
>>> Hi,
>>>
>>>
>>> On Mon, Jul 29, 2013 at 4:36 PM, Ankit Murarka<
>>> ankit.murarka@rancoretech.com>  wrote:
>>>
>>>> Since I am new to this, I can't stop exploring it and trying out
>>>> different features.
>>>>
>>>> I am now trying to implement "Did you mean" search using the
>>>> SpellChecker jar and the Lucene jar.
>>>>
>>>> I ran into plenty of problems, although I have got it working.
>>>>
>>>> code snippet:
>>>>
>>>> File dir = new File("D:\\Inde\\");
>>>> Directory directory = FSDirectory.open(dir);
>>>> SpellChecker spellChecker = new SpellChecker(directory);
>>>> String wordForSuggestions = "aski";
>>>> Analyzer analyzer = new CustomAnalyzerForCaseSensitive(Version.LUCENE_43);
>>>> // This analyzer only has commented LowerCaseFilter.
>>>> IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43, analyzer);
>>>> IndexWriter writer = new IndexWriter(directory, iwc);
>>>> File file1 = new File("D:\\Inde\\wordlist.txt");
>>>> indexDocs(writer, file1);
>>>> writer.close();
>>>> spellChecker.indexDictionary(
>>>>     new PlainTextDictionary(new File("D:\\Inde\\wordlist.txt")), iwc, false);
>>>> int suggestionsNumber = 10;
>>>> String[] suggestions = spellChecker.suggestSimilar(wordForSuggestions, suggestionsNumber);
>>>> if (suggestions != null && suggestions.length > 0) {
>>>>     for (String word : suggestions) {
>>>>         System.out.println("Did you mean:" + word + "");
>>>>     }
>>>> } else {
>>>>     System.out.println("No suggestions found for word:" + wordForSuggestions);
>>>> }
>>>>
>>>> The code works fine. It suggests 10 possible matches.
>>>> The problem is that I am creating/updating the indexes every time.
>>>>
>>>> Suppose I have 1000 log files and they are indexed in
>>>> D:\\LogIndexes. Instead of reading a standard dictionary and
>>>> building indexes from it, I wish to use these existing indexes to
>>>> suggest possible matches.
>>>>
>>>> Is that possible? If yes, what would the approach be? Please
>>>> provide some assistance.
>>>>
>>>
>>> Check out DirectSpellChecker:
>>> http://lucene.apache.org/core/4_3_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html
>>>
>>> With DirectSpellChecker you do not need to build a separate spell
>>> index; it uses the actual index for spell suggestions.
>>>
>>>
>>>
>>>> The next question would be suggesting a phrase. If I enter "Head ach
>>>> heav", then I should get "Head ache heavy" as one possible suggestion.
>>>> I haven't tried it yet, but it would surely be an absolute beauty to have.
>>>>
>>> DirectSpellChecker works on a single term, so there is no feature that
>>> will give you suggestions for a phrase out of the box.
>>>
>>> You might want to take each term of the query, check it for spelling
>>> mistakes, and then combine the terms back again. You could look at the
>>> code in Solr's SpellCheckComponent.addCollationsToResponse:
>>>
>>> http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate
>>>
>>> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java
>>>
>>>> Also, the examples available on the net for "Did you mean" are very
>>>> old, and the API has undergone significant changes, which makes them
>>>> not very useful.
>>>>
>>>>
>>>> -- 
>>>> Regards
>>>>
>>>> Ankit Murarka
>>>>
>>>> "Peace is found not in what surrounds us, but in what we hold within."
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>
>>
>
>


-- 
Regards

Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within us"




Re: Did you Mean search on Indexes created by Different Files. -Completed.

Posted by Ankit Murarka <an...@rancoretech.com>.
This may sound a bit awkward, but I am now able to implement "Did you mean"
search on my indexes.

The only help came from the "Lucene In Action" book.

It occurred to me that once I have created the index of my documents, I need
to pass that index to SpellChecker so that it can build its own index (in a
new directory, obviously).

Then I gave this new directory path to the SpellChecker to search, and it
gave me what I wanted: word suggestions drawn from the documents I supplied
as input.
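
For the phrase case raised earlier in the thread ("Head ach heav"), here is a
rough plain-Java sketch of the per-term idea: collect the vocabulary from your
own documents, correct each term of the query against it, and join the terms
back together. This is only an illustration and not how Lucene or Solr
implement it (Lucene's SpellChecker searches its own index rather than
scanning a set). The class name PhraseSuggester, the helper names, the sample
documents and the maxEdits threshold are all made up for this example, and
the matching is case sensitive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: per-term correction against a vocabulary
// taken from the user's own documents, then recombined into a phrase.
public class PhraseSuggester {

    /** Collects the distinct whitespace-separated terms of the supplied documents. */
    public static Set<String> vocabulary(List<String> documents) {
        Set<String> terms = new TreeSet<>();
        for (String doc : documents) {
            for (String term : doc.trim().split("\\s+")) {
                if (!term.isEmpty()) {
                    terms.add(term);
                }
            }
        }
        return terms;
    }

    /** Levenshtein edit distance, computed with two rolling rows. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the word itself if it is in the vocabulary, otherwise the closest
     * vocabulary term within maxEdits edits, otherwise the word unchanged.
     */
    public static String correct(String word, Set<String> vocab, int maxEdits) {
        if (vocab.contains(word)) {
            return word;
        }
        String best = word;
        int bestDistance = maxEdits + 1;
        for (String candidate : vocab) {
            int d = editDistance(word, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    /** Corrects every term of the phrase and joins the results back together. */
    public static String suggestPhrase(String phrase, Set<String> vocab, int maxEdits) {
        List<String> corrected = new ArrayList<>();
        for (String word : phrase.trim().split("\\s+")) {
            corrected.add(correct(word, vocab, maxEdits));
        }
        return String.join(" ", corrected);
    }

    public static void main(String[] args) {
        // Sample "documents" standing in for the indexed log files.
        Set<String> vocab = vocabulary(Arrays.asList("Head ache heavy", "Amrish logged in"));
        System.out.println("Did you mean: " + suggestPhrase("Head ach heav", vocab, 2));
    }
}
```

With the two sample lines above as the only "documents", the query
"Head ach heav" comes back as "Head ache heavy". A linear scan like this is
only reasonable for a small vocabulary.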

Hopefully someone may find it useful.


