You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Renaud Delbru <re...@deri.org> on 2010/01/07 13:46:32 UTC

Re: IndexingChain and TermHash

Hi Michael,

I have started to look at the PFOR codec. However, when I include the 
codec files inside the flex_1458 branch, it misses the 
org.apache.lucene.util.pfor.PFor class which is the core of the codec. 
Where can I find this class ?

Thanks,
Regards
-- 
Renaud Delbru

On 16/11/09 14:01, Michael McCandless wrote:
> Yes, the branch is here:
>
>      https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458
>
> Mark (Miller) periodically re-sync's it to trunk.
>
> All tests should pass, and if you create a new Codec, please share the
> experience!
>
> There are not yet many Codecs in existence... the branch has the
> "standard" codec (closest to Lucene's current index format, but makes
> some compelling improvements to the terms dict), a "pulsing" codec
> (which inlines low-freq terms into the terms dict), an intblock codec
> (an abstract base for building int-block codecs).  There's also the
> PForDelta codec, attached to LUCENE-1410, which subclasses the
> intblock codec and uses PForDelta encoding.  It's probably best to
> peek at these example codecs for inspiration on how to build yours.
>
> Mike
>
> On Mon, Nov 16, 2009 at 7:28 AM, Renaud Delbru<re...@deri.org>  wrote:
>    
>> Hi Michael,
>>
>> I see there is already a huge amount of work already done in LUCENE-1458. Is
>> there a way to checkout the corresponding branch, and start to use it ? At
>> least, to see if I can extend it and create my own Codec.
>> I have started on my side to abstract the indexing chain of Lucene 2.9, in
>> order to be able to plug my own chain, but I have the impression that you've
>> done something similar already (with the codec abstraction). Would be a pity
>> to lose my time doing something less convenient that your appraoch.
>>
>> Thanks.
>> --
>> Renaud Delbru
>>
>> On 14/11/09 13:22, Michael McCandless wrote:
>>      
>>> On Fri, Nov 6, 2009 at 1:34 PM, Renaud Delbru<re...@deri.org>
>>>   wrote:
>>>
>>>        
>>>> Hi Michael,
>>>>
>>>> Thanks for the quick fix. I have tested it (indexing multiple documents +
>>>> searching), and it seems to work.
>>>>
>>>> On 06/11/09 18:09, Michael McCandless wrote:
>>>>
>>>>          
>>>>> To be honest, you are sort of forging new territory here :)
>>>>>
>>>>>
>>>>>            
>>>> I think so too, not an easy task ;o). I have seen that you have tried to
>>>> make modular the indexing chain of Lucene (DocumentsWriter). I still try
>>>> to
>>>> have a good understanding of the default indexing, but I would like to
>>>> see
>>>> how it is easy (or difficult) to modify the format of the postings. From
>>>> my
>>>> current understanding, it seems that only the consumer at the end of this
>>>> chain (FreqProxTermsWriter and its consumer FormatPostingsFieldsWriter)
>>>> has
>>>> to be changed to a certain extend.
>>>>
>>>>          
>>> Right, those two classes do the writing of the postings, currently.
>>>
>>> But with flexible indexing (LUCENE-1458), still in progress, we hope
>>> to make it more easily pluggable, the codec that actually reads&
>>> writes the postings.
>>>
>>> Mike
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>        
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>      
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>    


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: IndexingChain and TermHash

Posted by Michael McCandless <lu...@mikemccandless.com>.
LUCENE-1410 has the PFor impl, that the PFor codec needs.

Mike

On Thu, Jan 7, 2010 at 7:46 AM, Renaud Delbru <re...@deri.org> wrote:
> Hi Michael,
>
> I have started to look at the PFOR codec. However, when I include the codec
> files inside the flex_1458 branch, it misses the
> org.apache.lucene.util.pfor.PFor class which is the core of the codec. Where
> can I find this class ?
>
> Thanks,
> Regards
> --
> Renaud Delbru
>
> On 16/11/09 14:01, Michael McCandless wrote:
>>
>> Yes, the branch is here:
>>
>>     https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458
>>
>> Mark (Miller) periodically re-sync's it to trunk.
>>
>> All tests should pass, and if you create a new Codec, please share the
>> experience!
>>
>> There are not yet many Codecs in existence... the branch has the
>> "standard" codec (closest to Lucene's current index format, but makes
>> some compelling improvements to the terms dict), a "pulsing" codec
>> (which inlines low-freq terms into the terms dict), an intblock codec
>> (an abstract base for building int-block codecs).  There's also the
>> PForDelta codec, attached to LUCENE-1410, which subclasses the
>> intblock codec and uses PForDelta encoding.  It's probably best to
>> peek at these example codecs for inspiration on how to build yours.
>>
>> Mike
>>
>> On Mon, Nov 16, 2009 at 7:28 AM, Renaud Delbru<re...@deri.org>
>>  wrote:
>>
>>>
>>> Hi Michael,
>>>
>>> I see there is already a huge amount of work already done in LUCENE-1458.
>>> Is
>>> there a way to checkout the corresponding branch, and start to use it ?
>>> At
>>> least, to see if I can extend it and create my own Codec.
>>> I have started on my side to abstract the indexing chain of Lucene 2.9,
>>> in
>>> order to be able to plug my own chain, but I have the impression that
>>> you've
>>> done something similar already (with the codec abstraction). Would be a
>>> pity
>>> to lose my time doing something less convenient that your appraoch.
>>>
>>> Thanks.
>>> --
>>> Renaud Delbru
>>>
>>> On 14/11/09 13:22, Michael McCandless wrote:
>>>
>>>>
>>>> On Fri, Nov 6, 2009 at 1:34 PM, Renaud Delbru<re...@deri.org>
>>>>  wrote:
>>>>
>>>>
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> Thanks for the quick fix. I have tested it (indexing multiple documents
>>>>> +
>>>>> searching), and it seems to work.
>>>>>
>>>>> On 06/11/09 18:09, Michael McCandless wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> To be honest, you are sort of forging new territory here :)
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> I think so too, not an easy task ;o). I have seen that you have tried
>>>>> to
>>>>> make modular the indexing chain of Lucene (DocumentsWriter). I still
>>>>> try
>>>>> to
>>>>> have a good understanding of the default indexing, but I would like to
>>>>> see
>>>>> how it is easy (or difficult) to modify the format of the postings.
>>>>> From
>>>>> my
>>>>> current understanding, it seems that only the consumer at the end of
>>>>> this
>>>>> chain (FreqProxTermsWriter and its consumer FormatPostingsFieldsWriter)
>>>>> has
>>>>> to be changed to a certain extend.
>>>>>
>>>>>
>>>>
>>>> Right, those two classes do the writing of the postings, currently.
>>>>
>>>> But with flexible indexing (LUCENE-1458), still in progress, we hope
>>>> to make it more easily pluggable, the codec that actually reads&
>>>> writes the postings.
>>>>
>>>> Mike
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org