Posted to java-user@lucene.apache.org by fp235-5 <ju...@lingway.com> on 2004/07/15 21:18:37 UTC

Re: release & migration plan

I am looking at the code to implement setIndexInterval() in IndexWriter. I'd
like to have your opinion on the best way to do it.

Currently the creation of an instance of TermInfosWriter requires the following
steps:
...
IndexWriter.addDocument(Document)
IndexWriter.addDocument(Document, Analyzer)
DocumentWriter.addDocument(String, Document)
DocumentWriter.writePostings(Posting[],String)
TermInfosWriter.<init>

To give a different value to indexInterval in TermInfosWriter, we need to add a
variable holding this value into IndexWriter and DocumentWriter and modify the
constructors for DocumentWriter and TermInfosWriter. (quite heavy changes)

Another option is to use a static variable in IndexWriter or TermInfosWriter and
access it directly. (quite dirty programming)
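
As a minimal sketch of the two options above: the classes below are
hypothetical stand-ins (SketchIndexWriter, SketchDocumentWriter,
SketchTermInfosWriter, StaticIntervalHolder), not the real Lucene classes,
whose constructors take further parameters. The default of 128 is an
assumption made for illustration.

```java
// Hypothetical stand-ins for the classes discussed in this thread.
// They only model how the indexInterval value travels, nothing else.

// Option 1: thread the value through the constructors.
class SketchTermInfosWriter {
    final int indexInterval;

    SketchTermInfosWriter(int indexInterval) {
        if (indexInterval < 1) {
            throw new IllegalArgumentException("indexInterval must be >= 1");
        }
        this.indexInterval = indexInterval;
    }
}

class SketchDocumentWriter {
    private final int indexInterval;

    SketchDocumentWriter(int indexInterval) {
        this.indexInterval = indexInterval;
    }

    // Stands in for writePostings(...), which creates the TermInfosWriter.
    SketchTermInfosWriter writePostings() {
        return new SketchTermInfosWriter(indexInterval);
    }
}

class SketchIndexWriter {
    private int indexInterval = 128; // assumed default, for illustration only

    void setIndexInterval(int indexInterval) {
        this.indexInterval = indexInterval;
    }

    // Stands in for addDocument(...): each call builds a new DocumentWriter.
    SketchTermInfosWriter addDocument() {
        return new SketchDocumentWriter(indexInterval).writePostings();
    }
}

// Option 2: a static variable read directly by the writer.
class StaticIntervalHolder {
    static int indexInterval = 128; // shared by every writer in the JVM
}

class StaticTermInfosWriter {
    final int indexInterval = StaticIntervalHolder.indexInterval;
}

class IndexIntervalOptionsDemo {
    public static void main(String[] args) {
        SketchIndexWriter a = new SketchIndexWriter();
        SketchIndexWriter b = new SketchIndexWriter();
        a.setIndexInterval(64);
        System.out.println(a.addDocument().indexInterval); // 64
        System.out.println(b.addDocument().indexInterval); // 128

        StaticIntervalHolder.indexInterval = 32;
        System.out.println(new StaticTermInfosWriter().indexInterval); // 32
    }
}
```

With the first option each writer instance keeps its own value; with the
static variable every writer in the same JVM sees the same setting, which
is the main reason it feels dirty.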

What would be the best solution for you? Is there another one?

Also, what would be the effect of changing this value between the indexing of
two documents? (harmless?)

Julien 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: release & migration plan

Posted by Doug Cutting <cu...@apache.org>.
For the purposes of this change, the DocumentWriter directory doesn't 
actually matter.  A persistent index is only written by the segment 
merger, so that's the only place the indexInterval really needs to be 
specified.

Doug
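
To illustrate why only the merge step would need the value, here is a
minimal sketch. It assumes, for illustration, that the term index simply
records every indexInterval-th term of the merged, sorted term dictionary.
MergeIntervalSketch and mergeAndSample are hypothetical names and do not
exist in Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class MergeIntervalSketch {

    /**
     * Merges the terms of several buffered segments into one sorted
     * dictionary and returns the terms that would go into the term index,
     * i.e. every indexInterval-th term starting with the first.
     */
    static List<String> mergeAndSample(List<List<String>> bufferedSegments,
                                       int indexInterval) {
        if (indexInterval < 1) {
            throw new IllegalArgumentException("indexInterval must be >= 1");
        }
        // The merged dictionary is sorted and free of duplicates.
        TreeSet<String> merged = new TreeSet<>();
        for (List<String> segment : bufferedSegments) {
            merged.addAll(segment);
        }
        List<String> indexTerms = new ArrayList<>();
        int position = 0;
        for (String term : merged) {
            if (position % indexInterval == 0) {
                indexTerms.add(term);
            }
            position++;
        }
        return indexTerms;
    }

    public static void main(String[] args) {
        // Two single-document segments buffered in memory.
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("apple", "cherry"),
                Arrays.asList("banana", "date", "elder"));
        // The interval is only read here, at merge time.
        System.out.println(mergeAndSample(segments, 2)); // [apple, cherry, elder]
    }
}
```

In this model the buffered segments carry no interval of their own, so a
value changed between two documents only takes effect at the next merge.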

Julien Nioche wrote:
> DocumentWriter is typically created with the
> ramDirectory field of IndexWriter and not the actual directory field.
> So getDirectory() should return this ramDirectory in order to work,
> which is not very intuitive (one could expect it to return the real
> directory). One could change the visibility of ramDirectory to package
> so that the DocumentWriter could access it??? Is it a clean way to do?

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org


Re: release & migration plan

Posted by Julien Nioche <Ju...@lingway.com>.
DocumentWriter is typically created with the ramDirectory field of
IndexWriter, not the actual directory field. So getDirectory() would
have to return ramDirectory for this to work, which is not very
intuitive (one would expect it to return the real directory). One could
instead change the visibility of ramDirectory to package so that
DocumentWriter can access it. Would that be a clean way to do it?
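
A minimal sketch of that alternative, using hypothetical stand-in classes
(SketchDirectory, DirWriterSketch, DirDocumentWriterSketch) rather than
the real Lucene ones: the public getDirectory() keeps returning the real
directory, while a package-private accessor exposes the RAM directory to
classes in the same package only.

```java
// Hypothetical placeholder for a directory; only its name matters here.
class SketchDirectory {
    final String name;

    SketchDirectory(String name) {
        this.name = name;
    }
}

// Stand-in for IndexWriter with its two directory fields.
class DirWriterSketch {
    private final SketchDirectory directory;            // the persistent index
    private final SketchDirectory ramDirectory =
            new SketchDirectory("ram");                  // in-memory buffer

    DirWriterSketch(SketchDirectory directory) {
        this.directory = directory;
    }

    // Public accessor: returns what a caller would intuitively expect.
    public SketchDirectory getDirectory() {
        return directory;
    }

    // Package-private accessor: visible to the document writer only
    // because it lives in the same package.
    SketchDirectory getRamDirectory() {
        return ramDirectory;
    }
}

// Stand-in for DocumentWriter, which writes to the RAM directory.
class DirDocumentWriterSketch {
    final SketchDirectory target;

    DirDocumentWriterSketch(DirWriterSketch writer) {
        this.target = writer.getRamDirectory();
    }
}

class DirectoryVisibilityDemo {
    public static void main(String[] args) {
        DirWriterSketch writer = new DirWriterSketch(new SketchDirectory("disk"));
        System.out.println(writer.getDirectory().name);                      // disk
        System.out.println(new DirDocumentWriterSketch(writer).target.name); // ram
    }
}
```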




Re: release & migration plan

Posted by Doug Cutting <cu...@apache.org>.
fp235-5 wrote:
> I am looking at the code to implement setIndexInterval() in IndexWriter. I'd
> like to have your opinion on the best way to do it.
> 
> Currently the creation of an instance of TermInfosWriter requires the following
> steps:
> ...
> IndexWriter.addDocument(Document)
> IndexWriter.addDocument(Document, Analyzer)
> DocumentWriter.addDocument(String, Document)
> DocumentWriter.writePostings(Posting[],String)
> TermInfosWriter.<init>
> 
> To give a different value to indexInterval in TermInfosWriter, we need to add a
> variable holding this value into IndexWriter and DocumentWriter and modify the
> constructors for DocumentWriter and TermInfosWriter. (quite heavy changes)

I think this is the best approach.  I would replace other parameters in 
these constructors which can be derived from an IndexWriter with the 
IndexWriter.  That way, if we add more parameters like this, they can 
also be passed in through the IndexWriter.

All of the parameters to the DocumentWriter constructor are fields of 
IndexWriter.  So one can instead simply pass a single parameter, an 
IndexWriter, then access its directory, analyzer, similarity and 
maxFieldLength in the DocumentWriter constructor.  A public 
getDirectory() method would also need to be added to IndexWriter for 
this to work.
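
A minimal sketch of this refactoring, assuming placeholder types: the
directory, analyzer and similarity are plain strings here, and
ParamIndexWriter and ParamDocumentWriter are hypothetical stand-ins, not
the real Lucene classes. The defaults shown are assumptions for
illustration.

```java
// Stand-in for IndexWriter: it owns every setting the document writer needs.
class ParamIndexWriter {
    private final String directory;   // placeholder for a Directory
    private final String analyzer;    // placeholder for an Analyzer
    private final String similarity;  // placeholder for a Similarity
    private int maxFieldLength = 10000; // assumed default
    private int indexInterval = 128;    // assumed default

    ParamIndexWriter(String directory, String analyzer, String similarity) {
        this.directory = directory;
        this.analyzer = analyzer;
        this.similarity = similarity;
    }

    public String getDirectory() { return directory; }
    String getAnalyzer() { return analyzer; }
    String getSimilarity() { return similarity; }
    int getMaxFieldLength() { return maxFieldLength; }
    int getIndexInterval() { return indexInterval; }

    void setIndexInterval(int indexInterval) {
        this.indexInterval = indexInterval;
    }
}

// Stand-in for DocumentWriter: one constructor parameter instead of four.
class ParamDocumentWriter {
    final String directory;
    final String analyzer;
    final String similarity;
    final int maxFieldLength;
    final int indexInterval;

    ParamDocumentWriter(ParamIndexWriter writer) {
        // Everything is derived from the IndexWriter, so adding a new
        // setting later does not change this constructor's signature.
        this.directory = writer.getDirectory();
        this.analyzer = writer.getAnalyzer();
        this.similarity = writer.getSimilarity();
        this.maxFieldLength = writer.getMaxFieldLength();
        this.indexInterval = writer.getIndexInterval();
    }
}

class SingleParameterDemo {
    public static void main(String[] args) {
        ParamIndexWriter writer = new ParamIndexWriter("ram", "standard", "default");
        writer.setIndexInterval(256);
        ParamDocumentWriter docWriter = new ParamDocumentWriter(writer);
        System.out.println(docWriter.indexInterval); // 256
    }
}
```

Because the values are copied in the constructor, a setting changed on the
writer in this sketch only affects document writers created afterwards.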

Similarly, two of SegmentMerger's constructor parameters could be 
replaced with an IndexWriter, the directory and boolean useCompoundFile.

In SegmentMerger I would replace the directory parameter with IndexWriter.

Doug
