Posted to dev@lucene.apache.org by blazingwolf7 <bl...@gmail.com> on 2008/07/02 11:46:38 UTC

readVInt, what is it for?

Hi, 

I am fairly new to Lucene and am currently going through its source
code. I am trying to determine how Lucene calculates the frequency
of a term in each document it finds.

I encountered a method named readVInt() in the IndexInput class. It seems
that every time this method is called, it produces the document
number and the frequency of the term in each document.

I am wondering how it works, and I have failed to find any information on it
on the Internet. Could anyone explain it to me? Thanks
-- 
View this message in context: http://www.nabble.com/readVInt%2C-what-is-it-for--tp18233802p18233802.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


RE: readVInt, what is it for?

Posted by blazingwolf7 <bl...@gmail.com>.
Thanks for all the help. I understand how it works now. Next I need
to know how to modify the .frq file. Can anyone help me with this?





RE: readVInt, what is it for?

Posted by "Mukherjee, Prasenjit" <P....@corp.aol.com>.
Slide 16 in the following presentation might be of some help. Let me know
if it helps.

http://docs.google.com/Presentation?docid=dmsxgtg_98dbh529dn

-Prasen 



Re: readVInt, what is it for?

Posted by Grant Ingersoll <gs...@apache.org>.
I'd suggest starting with a couple of places:
http://lucene.apache.org/java/2_3_2/fileformats.html

and

http://lucene.apache.org/java/2_3_2/scoring.html

and then do as Yonik said and step through the internals, starting
with a simple TermQuery, which leads to the TermScorer.

-Grant



--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


Re: readVInt, what is it for?

Posted by Yonik Seeley <yo...@apache.org>.
Lucene creates an inverted index and uses it to search.
Frequency is encoded in the .frq files:
http://lucene.apache.org/java/docs/fileformats.html

-Yonik
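
The original observation, that successive readVInt() calls yield both the
document number and the frequency, matches the TermFreqs layout described on
the 2.3 file formats page: each entry is a DocDelta VInt whose value divided
by two is the gap to the previous document number, and whose low bit says
whether the frequency is exactly one (odd) or follows as a separate VInt
(even). The sketch below decodes that layout from a list of already-decoded
VInt values. The class and method names are made up for illustration, skip
data is ignored, and this is not Lucene's own reader code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: decodes one term's TermFreqs.
class FrqSketch {
    // vints: the VInt values in file order; docFreq: number of documents
    // containing the term. Returns {docNumber, freq} pairs.
    static List<int[]> decodeTermFreqs(int[] vints, int docFreq) {
        List<int[]> postings = new ArrayList<>();
        int doc = 0; // document numbers are stored as gaps, so keep a running sum
        int i = 0;
        for (int n = 0; n < docFreq; n++) {
            int docDelta = vints[i++];
            doc += docDelta >>> 1;          // upper bits: gap to the previous doc
            int freq = (docDelta & 1) != 0  // low bit set: frequency is exactly 1
                    ? 1
                    : vints[i++];           // otherwise the frequency is the next VInt
            postings.add(new int[] {doc, freq});
        }
        return postings;
    }

    public static void main(String[] args) {
        // Example from the file formats page: a term occurring once in
        // document 7 and three times in document 11 is stored as 15, 8, 3.
        for (int[] p : decodeTermFreqs(new int[] {15, 8, 3}, 2)) {
            System.out.println("doc=" + p[0] + " freq=" + p[1]);
        }
    }
}
```

Run on the sequence 15, 8, 3, this prints doc=7 freq=1 and then doc=11 freq=3.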




Re: readVInt, what is it for?

Posted by blazingwolf7 <bl...@gmail.com>.
Hmmm, I don't think I get it. How is it tracked at index time? I indexed my
files earlier. Later I will open the index and perform a search. Shouldn't
the frequency of each term in each document found be calculated during
the search?





Re: readVInt, what is it for?

Posted by Yonik Seeley <yo...@apache.org>.
The frequency is tracked at index time.  It's simply a read at query
time.  See TermDocs.
If you really want to understand more about the code internals of
Lucene, I'd suggest stepping through more example queries with a
debugger.

-Yonik
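
To illustrate "tracked at index time, read at query time", here is a toy
in-memory inverted index. It is only a sketch of the idea and has nothing to
do with Lucene's real classes or on-disk format. TinyInvertedIndex,
addDocument and termDocs are hypothetical names, the last one chosen to echo
Lucene's TermDocs. The counting happens once, while a document is added; a
lookup afterwards only reads the stored counts.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration only; not how Lucene stores postings.
class TinyInvertedIndex {
    // term -> (document number -> frequency of the term in that document)
    private final Map<String, TreeMap<Integer, Integer>> postings = new HashMap<>();
    private int nextDoc = 0;

    // Index time: tokenize the text once and count occurrences per term.
    int addDocument(String text) {
        int doc = nextDoc++;
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .merge(doc, 1, Integer::sum);
        }
        return doc;
    }

    // Query time: nothing is counted, the stored frequencies are just read.
    Map<Integer, Integer> termDocs(String term) {
        TreeMap<Integer, Integer> docs = postings.get(term);
        return docs == null ? Collections.<Integer, Integer>emptyMap() : docs;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.addDocument("the quick brown fox");
        index.addDocument("the lazy dog and the other dog");
        System.out.println(index.termDocs("dog")); // {1=2}
        System.out.println(index.termDocs("the")); // {0=1, 1=2}
    }
}
```

In Lucene itself the per-document counts end up in the .frq file, and
TermDocs exposes them through doc() and freq() as it iterates.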




RE: readVInt, what is it for?

Posted by blazingwolf7 <bl...@gmail.com>.
Thanks, that is clear to me now. But does anyone know where the frequency of
the term in each document is calculated? I mean, which class and which
method is it in?
Thanks





RE: readVInt, what is it for?

Posted by Uwe Schindler <uw...@thetaphi.de>.
A VInt is how integers are stored in the index files: in a compressed,
variable-length form.

Read here: http://lucene.apache.org/java/2_3_2/fileformats.html#VInt

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
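
As a rough illustration of the format described at that link: each byte
carries seven bits of the value, low-order bits first, and a set high bit
means another byte follows. The sketch below encodes and decodes
non-negative ints that way using plain byte arrays. VIntSketch and its
methods are hypothetical names that only mirror the idea behind writeVInt()
and readVInt(); this is not Lucene's IndexInput/IndexOutput code.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical stand-alone sketch of the VInt encoding.
class VIntSketch {
    // Write a non-negative int, seven bits per byte, low-order bits first.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value);                     // high bit clear: last byte
    }

    // Decode one VInt starting at pos[0] and advance pos[0] past it.
    static int readVInt(byte[] in, int[] pos) {
        byte b = in[pos[0]++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in[pos[0]++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, 127);   // fits in one byte
        writeVInt(out, 128);   // needs two bytes
        writeVInt(out, 16384); // needs three bytes
        byte[] bytes = out.toByteArray();
        System.out.println(bytes.length);            // 6
        int[] pos = {0};
        System.out.println(readVInt(bytes, pos));    // 127
        System.out.println(readVInt(bytes, pos));    // 128
        System.out.println(readVInt(bytes, pos));    // 16384
    }
}
```

Small values, such as the gaps between neighbouring document numbers,
therefore usually take a single byte instead of four.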



