Posted to dev@lucene.apache.org by blazingwolf7 <bl...@gmail.com> on 2008/07/03 04:30:31 UTC

Class in Lucene that Perform Search

Hi, 

I am currently using Lucene to build a search engine, and I am going through
its source code to understand it better. I have traced it all the way from
beginning to end and have managed to locate all the classes that calculate
the score and return the results.

What I am missing is the class that performs the actual comparison to
determine whether a query matches any term in a document. I have also failed
to locate the class that is responsible for retrieving the documents that
contain the specified term. Can anyone help me with this? Maybe just tell me
the related classes. Thanks
-- 
View this message in context: http://www.nabble.com/Class-in-Lucene-that-Perform-Search-tp18250664p18250664.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Class in Lucene that Perform Search

Posted by Yonik Seeley <yo...@apache.org>.
Moving to java-user.

Index the field and use the FieldCache.
You don't want to modify core Lucene classes for this common use case.

-Yonik
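
The FieldCache suggestion above amounts to loading one value per document into
an array once, then answering every later lookup with an array read instead of
a stored-document fetch. In Lucene 2.x the real entry points are calls such as
FieldCache.DEFAULT.getStrings(reader, field) and
FieldCache.DEFAULT.getInts(reader, field), which expect the field to be indexed
as a single un-tokenized term per document. The sketch below is only a toy
model of that idea in plain Java, not Lucene's API: the class ToyFieldCache,
the nested Doc type and the example URLs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy per-document value cache; illustrates the idea behind FieldCache, not its API. */
final class ToyFieldCache {
    /** Hypothetical stand-in for a document with the two fields from the question. */
    static final class Doc {
        final String url;
        final int contentLength;

        Doc(String url, int contentLength) {
            this.url = url;
            this.contentLength = contentLength;
        }
    }

    final String[] urls;        // urls[docId]
    final int[] contentLengths; // contentLengths[docId]

    /** One pass over all documents fills both arrays; later lookups are array reads. */
    ToyFieldCache(List<Doc> docs) {
        urls = new String[docs.size()];
        contentLengths = new int[docs.size()];
        for (int docId = 0; docId < docs.size(); docId++) {
            urls[docId] = docs.get(docId).url;
            contentLengths[docId] = docs.get(docId).contentLength;
        }
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("http://example.com/a", 1200));
        docs.add(new Doc("http://example.com/b", 340));

        ToyFieldCache cache = new ToyFieldCache(docs);
        int hitDocId = 1; // pretend this doc id came back from a search
        System.out.println(cache.urls[hitDocId] + " " + cache.contentLengths[hitDocId]);
    }
}
```

The trade-off is memory: the arrays hold one entry for every document in the
index, so this suits small per-document values such as a length or a URL.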

On Thu, Jul 3, 2008 at 10:37 PM, blazingwolf7 <bl...@gmail.com> wrote:
> I am trying to retrieve the contentLength and the URL of each document from
> the index without continuously using IndexReader, e.g.:
> reader.document.get("ur");
>
> I am trying to find a way to retrieve all these values and store them in an
> array by using the IndexReader only once or twice. I thought maybe I could
> store some extra values in the .frq file so that I would have no need to
> continuously use the reader. Can anyone suggest another approach? Thanks


Re: Class in Lucene that Perform Search

Posted by blazingwolf7 <bl...@gmail.com>.
I am trying to retrieve the contentLength and the URL of each document from
the index without continuously using IndexReader, e.g.:
reader.document.get("ur");

I am trying to find a way to retrieve all these values and store them in an
array by using the IndexReader only once or twice. I thought maybe I could
store some extra values in the .frq file so that I would have no need to
continuously use the reader. Can anyone suggest another approach? Thanks




Re: Class in Lucene that Perform Search

Posted by Yonik Seeley <yo...@apache.org>.
On Thu, Jul 3, 2008 at 4:03 AM, blazingwolf7 <bl...@gmail.com> wrote:
> Ah, thanks! That is clear now. I have to change tactics to achieve what I need.
> Which class creates the .frq file at indexing time?

DocumentsWriter (called from IndexWriter).

> If possible, I want to add an extra value to it so that I can retrieve the
> information during the search process. Thanks

Look at payloads first.
What problem are you trying to solve?  Someone may have an easier
approach for you if payloads don't work.

-Yonik





Re: Class in Lucene that Perform Search

Posted by blazingwolf7 <bl...@gmail.com>.
Ah, thanks! That is clear now. I have to change tactics to achieve what I need.
Which class creates the .frq file at indexing time?

If possible, I want to add an extra value to it so that I can retrieve the
information during the search process. Thanks




Re: Class in Lucene that Perform Search

Posted by Yonik Seeley <yo...@apache.org>.
On Wed, Jul 2, 2008 at 10:30 PM, blazingwolf7 <bl...@gmail.com> wrote:
> What I am missing is the class that performs the actual comparison to
> determine whether a query matches any term in a document.

You need to understand the inverted index format.  Which documents
match a term is determined at index time, not at query time.  The .frq
file lists all documents that match each term.

TermDocs iterates over all documents that match the term by reading
the .frq file.

-Yonik
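
The point above, that matching is decided at index time, can be sketched with
a toy inverted index. This is a minimal illustration in plain Java and not
Lucene's implementation: the class ToyInvertedIndex and its methods are
hypothetical, it splits text on whitespace instead of using an analyzer, and
it assumes documents are added in ascending doc id order. In Lucene 2.x the
corresponding real lookup is IndexReader.termDocs(Term), iterated with
TermDocs.next(), doc() and freq().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy inverted index: the same idea as Lucene's postings, in miniature. */
final class ToyInvertedIndex {
    // term -> ascending list of ids of the documents that contain the term
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Index time: split the text into terms and record the doc id under each term. */
    void addDocument(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            List<Integer> docs = postings.computeIfAbsent(term, t -> new ArrayList<>());
            // Skip the doc id if this term was already recorded for this document.
            if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) {
                docs.add(docId);
            }
        }
    }

    /** Query time: a map lookup; no document text is scanned or compared. */
    List<Integer> docsFor(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptyList());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument(0, "Lucene is a search library");
        index.addDocument(1, "An inverted index maps terms to documents");
        index.addDocument(2, "Lucene builds an inverted index");

        System.out.println(index.docsFor("lucene"));   // [0, 2]
        System.out.println(index.docsFor("inverted")); // [1, 2]
    }
}
```

This is why there is no class that compares a query against each document at
search time: the list of matching documents already exists and is only read.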
