Posted to java-user@lucene.apache.org by Ankit Murarka <an...@rancoretech.com> on 2013/08/02 07:39:43 UTC
Re: Did you Mean search on Indexes created by Different Files. -Completed.
This may sound a bit awkward, but I am now able to implement Did You Mean
search on my indexes.
The only help came from the "Lucene in Action" book.
It occurred to me that once I have created the index of my documents, I need
to pass that index to SpellChecker so that it can build its own index (in a
new directory, obviously).
Then I pointed the SpellChecker at this new directory for searching, and it
gave me what I wanted: word suggestions drawn from the documents I supplied
as input.
Hopefully someone will find this useful.
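For anyone following along, here is a minimal sketch of those steps. It assumes the Lucene 4.3 jars (lucene-core, lucene-analyzers-common, lucene-suggest) are on the classpath and uses the field name "contents" from the earlier snippet. The class name, the helper methods, the sample log line, and the use of in-memory RAMDirectory instances in place of the on-disk directories are all made up for illustration, so treat it as a starting point rather than the exact code I ran.

```java
import java.io.IOException;

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DidYouMeanFromIndex {

    // Stand-in for the existing document index (hypothetical sample log line).
    static Directory buildSampleDocumentIndex() throws IOException {
        Directory dir = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_43, new WhitespaceAnalyzer(Version.LUCENE_43));
        IndexWriter writer = new IndexWriter(dir, config);
        Document doc = new Document();
        doc.add(new TextField("contents",
                "connection timeout while contacting server", Field.Store.NO));
        writer.addDocument(doc);
        writer.close();
        return dir;
    }

    // Feed the terms of one field of the document index into a separate
    // spell index that SpellChecker maintains in its own directory.
    static SpellChecker buildSpellChecker(Directory documentIndex, String field,
                                          Directory spellIndex) throws IOException {
        IndexReader reader = DirectoryReader.open(documentIndex);
        try {
            SpellChecker spellChecker = new SpellChecker(spellIndex);
            // Use a fresh IndexWriterConfig here; do not reuse one from another writer.
            IndexWriterConfig config = new IndexWriterConfig(
                    Version.LUCENE_43, new KeywordAnalyzer());
            spellChecker.indexDictionary(
                    new LuceneDictionary(reader, field), config, false);
            return spellChecker;
        } finally {
            reader.close();
        }
    }

    // Build both indexes and ask the spell index for up to 10 suggestions.
    static String[] suggest(String misspelled) throws IOException {
        Directory documentIndex = buildSampleDocumentIndex();
        Directory spellIndex = new RAMDirectory(); // the "new directory"
        SpellChecker spellChecker =
                buildSpellChecker(documentIndex, "contents", spellIndex);
        try {
            return spellChecker.suggestSimilar(misspelled, 10);
        } finally {
            spellChecker.close();
        }
    }

    public static void main(String[] args) throws IOException {
        String[] suggestions = suggest("timeot");
        if (suggestions.length == 0) {
            System.out.println("No suggestions found");
        }
        for (String word : suggestions) {
            System.out.println("Did you mean: " + word);
        }
    }
}
```

To run this against indexes on disk, as in the thread, the two RAMDirectory instances would be replaced with FSDirectory.open(new File(...)) for the existing document index and for a new, separate spell index directory. The spell index only needs rebuilding when the document index changes.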
On 8/1/2013 10:44 AM, Ankit Murarka wrote:
> Can anyone please guide me on how to implement Did You Mean Search
> using indexes created from the supplied bunch of files as an input.
>
> On 7/31/2013 11:15 AM, Ankit Murarka wrote:
>> Any help on this will be highly appreciated. I have been trying every
>> option I can think of, but to no avail.
>>
>> I also tried LuceneDictionary, but this does not seem to help either.
>>
>> Please guide.
>>
>> On 7/30/2013 4:49 PM, Ankit Murarka wrote:
>>> Hello.
>>>
>>> Using DirectSpellChecker is not serving my purpose. This seems to
>>> return word suggestions from a dictionary whereas I wish to return
>>> search suggestion from Indexes I created supplying my own Files
>>> (These files are generally log files).
>>>
>>> I created indexes for certain files in D:\\Indexes and the field
>>> name is "content"
>>>
>>> Then I used DirectSpellChecker and passed it an IndexReader. I
>>> invoked suggestSimilar, got a SuggestWord array as the output, and
>>> iterated over the array.
>>>
>>> I seem to get suggested words from the dictionary and not from the
>>> indexes.
>>>
>>> Code Snippet for the search part:
>>>
>>> String index="D:\\Indexes";
>>> String field = "contents";
>>> IndexReader reader = DirectoryReader.open(FSDirectory.open(new
>>> File(index)));
>>> DirectSpellChecker dsc=new DirectSpellChecker();
>>> Term term1=new Term(field, "Amrih");
>>> SuggestWord[] suggestWord=dsc.suggestSimilar(term1, 10, reader);
>>> if(suggestWord!=null && suggestWord.length>0)
>>> {
>>> for(SuggestWord word:suggestWord)
>>> {
>>> System.out.println("Did you Mean " + word.string );
>>> }
>>>
>>> }
>>> else
>>> {
>>> System.out.println("No Suggestions found");
>>> }
>>>
>>>
>>> Please guide. Basically the suggested words should be provided from
>>> the indexes I have created.. It should not come from any
>>> dictionary.. Is it possible ?
>>>
>>>
>>> On 7/29/2013 9:34 PM, Varun Thacker wrote:
>>>> Hi,
>>>>
>>>>
>>>> On Mon, Jul 29, 2013 at 4:36 PM, Ankit Murarka<
>>>> ankit.murarka@rancoretech.com> wrote:
>>>>
>>>>> Since I am new to this, I can't stop exploring it and trying to use
>>>>> different features.
>>>>>
>>>>> I am now trying to implement "Did you Mean " search using
>>>>> SpellChecker jar
>>>>> and Lucene jar.
>>>>>
>>>>> The problems I faced are plenty, although I have got it working.
>>>>>
>>>>> code snippet:
>>>>>
>>>>> File dir = new File("D:\\Inde\\");
>>>>> Directory directory = FSDirectory.open(dir);
>>>>> SpellChecker spellChecker = new SpellChecker(directory);
>>>>> String wordForSuggestions = "aski";
>>>>> Analyzer analyzer=new
>>>>> CustomAnalyzerForCaseSensitive(Version.LUCENE_43);
>>>>> //This analyzer only has commented LowerCaseFilter.
>>>>> IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43,
>>>>> analyzer);
>>>>> IndexWriter writer = new IndexWriter(directory, iwc);
>>>>> File file1=new File("D:\\Inde\\wordlist.txt");
>>>>> indexDocs(writer,file1);
>>>>> writer.close();
>>>>> spellChecker.indexDictionary(
>>>>> new PlainTextDictionary(new File("D:\\Inde\\wordlist.txt")), iwc,
>>>>> false);
>>>>> int suggestionsNumber = 10;
>>>>> String[] suggestions = spellChecker.
>>>>> suggestSimilar(wordForSuggestions, suggestionsNumber);
>>>>> if (suggestions!=null&& suggestions.length>0) {
>>>>>
>>>>> for (String word : suggestions) {
>>>>>
>>>>> System.out.println("Did you mean:" + word + "");
>>>>>
>>>>> }
>>>>>
>>>>> }
>>>>> else {
>>>>>
>>>>> System.out.println("No suggestions found for
>>>>> word:"+wordForSuggestions);
>>>>>
>>>>> }
>>>>>
>>>>> The code works fine. It suggests 10 possible matches.
>>>>> The problem is that I am creating/updating the indexes every time.
>>>>>
>>>>> Say suppose I have 1000 log files and these files are indexed in
>>>>> D:\\LogIndexes. Instead of reading a standard dictionary and
>>>>> building up
>>>>> indexes, I wish to use these indexes to suggest me possible match..
>>>>>
>>>>> Is it possible to do?. If yes, what can be the approach. Please
>>>>> provide
>>>>> some assistance.
>>>>>
>>>>
>>>> Check out DirectSpellChecker (
>>>> http://lucene.apache.org/core/4_3_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html
>>>>
>>>> )
>>>>
>>>> Using DirectSpellChecker you do not need to build a separate spell
>>>> index;
>>>> instead it uses the actual index for spell suggestions.
>>>>
>>>>
>>>>
>>>>> Next question would be to suggest a phrase. If I enter "Head ach
>>>>> heav" ,
>>>>> then I should get "Head ache heavy" as one possible suggestion.
>>>>> haven't
>>>>> tried it yet but surely will be an absolute beauty to have it..
>>>>>
>>>> DirectSpellChecker works on a term so there is no feature which
>>>> will give
>>>> you suggestions on a phrase out of the box.
>>>>
>>>> You might want to take each term of the query and check for spell
>>>> mistakes,
>>>> and then combine them back again. You could look up the code from
>>>> Solr.SpellCheckComponent.addCollationsToResponse
>>>>
>>>> http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate
>>>>
>>>> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java
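
The per-term idea above can be sketched without any Lucene dependency. This is only an illustration of "correct each term, then join them back", not what Solr's collation code does. The class and method names, the three-word vocabulary, and the edit-distance threshold of 2 are all assumptions; in a real setup suggestTerm would be replaced by a call to a spell checker backed by your own index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PhraseSuggester {

    // Classic dynamic-programming Levenshtein edit distance (case-sensitive).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical per-term lookup: returns the closest vocabulary term within
    // maxDistance edits, or the term unchanged if it is known or nothing is close.
    static String suggestTerm(String term, List<String> vocabulary, int maxDistance) {
        String best = term;
        int bestDistance = maxDistance + 1;
        for (String candidate : vocabulary) {
            if (candidate.equals(term)) {
                return term;
            }
            int distance = editDistance(term, candidate);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = candidate;
            }
        }
        return best;
    }

    // Split the phrase on whitespace, correct each term, and recombine.
    static String suggestPhrase(String phrase, List<String> vocabulary, int maxDistance) {
        List<String> corrected = new ArrayList<String>();
        for (String term : phrase.trim().split("\\s+")) {
            corrected.add(suggestTerm(term, vocabulary, maxDistance));
        }
        return String.join(" ", corrected);
    }

    public static void main(String[] args) {
        List<String> vocabulary = Arrays.asList("Head", "ache", "heavy");
        // Prints: Head ache heavy
        System.out.println(suggestPhrase("Head ach heav", vocabulary, 2));
    }
}
```

Correcting terms independently ignores context, so the combined phrase is not guaranteed to match any document; checking the recombined phrase against the index before offering it is one way to filter out such suggestions.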
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Also examples available on net for "Did you mean" are very very
>>>>> old and
>>>>> API have undergone significant changes thus making them not so
>>>>> very useful.
>>>>>
>>>>>
>>>>> --
>>>>> Regards
>>>>>
>>>>> Ankit Murarka
>>>>>
>>>>> "Peace is found not in what surrounds us, but in what we hold
>>>>> within."
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>>
>>>>> To unsubscribe, e-mail:
>>>>> java-user-unsubscribe@lucene.apache.org
>>>>>
>>>>> For additional commands, e-mail:
>>>>> java-user-help@lucene.apache.org
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
>
--
Regards
Ankit Murarka
"What lies behind us and what lies before us are tiny matters compared with what lies within us"