Posted to dev@lucene.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2003/10/12 19:02:18 UTC
Re: [PATCH] IndexWriter : controlling the number of Docs merged
Thanks Julien, I put your patch in Bugzilla, so we don't lose it.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=23754
Otis
--- fp235-5 <ju...@lingway.com> wrote:
> Sorry, here is the patch ;-)
>
>
> ---------- Start of original message -----------
>
> From: "fp235-5" <ju...@lingway.com>
> To: "lucene-dev" <lu...@jakarta.apache.org>
> Cc:
> Date: Sat, 20 Sep 2003 16:06:06 +0200
> Subject: [PATCH] IndexWriter : controlling the number of Docs merged
>
> Hello,
>
> Someone suggested yesterday adding a variable to IndexWriter to control
> the number of Documents merged in the RAMDirectory independently of the
> mergeFactor. (I'm sorry, I don't remember who exactly; the mail arrived
> at my office.)
>
> I'm proposing a tiny modification of IndexWriter to add this
> functionality. A variable minMergeDocs specifies the number of Documents
> to be merged in memory before starting a new Segment. The mergeFactor
> still controls the number of Segments created in the Directory, so it is
> possible to avoid the file number limitation problem.
>
> The diff file is attached.
>
> As Dmitry and Erik noted, there are no true JUnit tests. I'd be happy to
> write a JUnit test for this feature. The problem is that the SegmentInfos
> field is private in IndexWriter and can't be used to check the number and
> size of the Segments. I ran a test using the infoStream variable of
> IndexWriter, and everything seems to be OK.
>
> Any comments / suggestions are welcome.
>
> Regards
>
> Julien
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
>
> Index: IndexWriter.java
> ===================================================================
> RCS file: /home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/index/IndexWriter.java,v
> retrieving revision 1.15
> diff -u -r1.15 IndexWriter.java
> --- IndexWriter.java 15 Sep 2003 12:40:23 -0000 1.15
> +++ IndexWriter.java 20 Sep 2003 12:22:13 -0000
> @@ -249,6 +249,16 @@
> *
> * <p>This must never be less than 2. The default value is 10.*/
> public int mergeFactor = 10;
> +
> + /** Determines the minimal number of documents required before merging
> + * and starting a new Segment. Since Documents are merged in a
> + * {@link org.apache.lucene.store.RAMDirectory}, large value gives faster
> + * indexing. At the same time mergeFactor limits the number of files open in
> + * a FSDirectory.
> + *
> + * <p> The default value is 10.*/
> + public int minMergeDocs = 10;
> +
>
> /** Determines the largest number of documents ever merged by addDocument().
> * Small values (e.g., less than 10,000) are best for interactive indexing,
> @@ -316,7 +326,7 @@
>
> /** Incremental segment merger. */
> private final void maybeMergeSegments() throws IOException {
> - long targetMergeDocs = mergeFactor;
> + long targetMergeDocs = minMergeDocs;
> while (targetMergeDocs <= maxMergeDocs) {
> // find segments smaller than current target size
> int minSegment = segmentInfos.size();
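
The excerpt above shows only the first lines of maybeMergeSegments(); the rest of the loop does not appear in this thread. To illustrate what starting the target at minMergeDocs instead of mergeFactor could change, here is a minimal, self-contained sketch. It assumes the rest of the loop sums the trailing segments smaller than the current target, merges them once their total reaches the target, and then multiplies the target by mergeFactor. The class MergeSketch and its methods simulate and maybeMerge are hypothetical names used for illustration, not Lucene API, and modelling every added document as a one-document segment is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the merge policy; not Lucene code.
class MergeSketch {

    // Adds numDocs one-document segments, running the merge check after
    // each one, and returns the resulting segment sizes (oldest first).
    static List<Integer> simulate(int numDocs, int minMergeDocs,
                                  int mergeFactor, int maxMergeDocs) {
        if (minMergeDocs < 1 || mergeFactor < 2) {
            throw new IllegalArgumentException(
                "minMergeDocs must be >= 1 and mergeFactor >= 2");
        }
        List<Integer> segments = new ArrayList<>();
        for (int i = 0; i < numDocs; i++) {
            segments.add(1); // assumption: each new document is its own segment
            maybeMerge(segments, minMergeDocs, mergeFactor, maxMergeDocs);
        }
        return segments;
    }

    // Assumed shape of the loop: the first target size is minMergeDocs
    // (the patched line), and each later target is mergeFactor times larger.
    static void maybeMerge(List<Integer> segments, int minMergeDocs,
                           int mergeFactor, int maxMergeDocs) {
        long targetMergeDocs = minMergeDocs;
        while (targetMergeDocs <= maxMergeDocs) {
            // find trailing segments smaller than the current target size
            int minSegment = segments.size();
            int mergeDocs = 0;
            while (--minSegment >= 0) {
                int docCount = segments.get(minSegment);
                if (docCount >= targetMergeDocs) {
                    break;
                }
                mergeDocs += docCount;
            }
            if (mergeDocs < targetMergeDocs) {
                break; // not enough small documents to merge yet
            }
            // replace the trailing small segments with one merged segment
            segments.subList(minSegment + 1, segments.size()).clear();
            segments.add(mergeDocs);
            targetMergeDocs *= mergeFactor; // increase target size
        }
    }

    public static void main(String[] args) {
        // Same mergeFactor, different minMergeDocs.
        System.out.println(simulate(25, 10, 10, Integer.MAX_VALUE));
        System.out.println(simulate(250, 100, 10, Integer.MAX_VALUE).size());
    }
}
```

In this model, raising minMergeDocs from 10 to 100 while keeping mergeFactor at 10 makes the first merged segment hold 100 documents instead of 10, so more documents accumulate as small segments before the first merge. In the proposal those small segments are the ones merged in the RAMDirectory.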
index summary facilities
Posted by Alex Aw Seat Kiong <al...@bigonthenet.com>.
Hi!
Does the Lucene indexer library include index summary facilities, such as:
- the total number of records indexed
- a list of the sources (uid) that were indexed
- the total number of records deleted
- a list of the sources (uid) that were deleted
Also, how can I use the searcher to retrieve all the records that were
indexed or deleted? What kind of query string would do this?
Thanks,
AlexAw