You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/09/28 08:33:06 UTC

[GitHub] [lucene-solr] s1monw opened a new pull request #1925: Cleanup DWPT state handling

s1monw opened a new pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925


   DWPT currently has no real notion of a state but it's lifecycle really
   requires such a notion. We move DWPTs from active to flush pending to flushing
   and execute certain actions like RAM accounting based on these states. To simplify
   the transitions and the concurrency involved in it, it makes sense to formalize the
   transitions and if it can happen under lock or not.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] s1monw commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
s1monw commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r495778992



##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -593,9 +568,61 @@ void unlock() {
   }
 
   /**
-   * Returns <code>true</code> iff this DWPT has been flushed
+   * Returns the DWPTs current state.
    */
-  boolean hasFlushed() {
-    return hasFlushed.get() == Boolean.TRUE;
+  State getState() {
+    return state;
   }
+
+  /**
+   * Transitions the DWPT to the given state of fails if the transition is invalid.
+   * @throws IllegalStateException if the given state can not be transitioned to.
+   */
+  synchronized void transitionTo(State state) {
+    if (state.canTransitionFrom(this.state) == false) {
+      throw new IllegalStateException("Can't transition from " + this.state + " to " + state);
+    }
+    assert state.mustHoldLock == false || isHeldByCurrentThread() : "illegal state: " + state + " lock is held: " + isHeldByCurrentThread();
+    this.state = state;
+  }
+
+  /**
+   * Internal DWPT State.
+   */
+  enum State {
+    /**
+     * Default states when a DWPT is initialized and ready to index documents.
+     */
+    ACTIVE(null, true),
+    /**
+     * The DWPT can still index documents but should be moved to FLUSHING state as soon as possible.
+     * Transitions to this state can be done concurrently while another thread is actively indexing into this DWPT.
+     */
+    FLUSH_PENDING(ACTIVE, false),
+    /**
+     * The DWPT should not receive any further documents and is current flushing or queued up for flushing.
+     */
+    FLUSHING(FLUSH_PENDING, true),
+    /**
+     * The DWPT has been flushed and is ready to be garbage collected.
+     */
+    FLUSHED(FLUSHING, false);
+
+    private final State previousState;
+    final boolean mustHoldLock; // only for asserts
+
+    State(State previousState, boolean mustHoldLock) {

Review comment:
       I gotta solve the halting problem first many, I am on it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dweiss commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r495780529



##########
File path: lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java
##########
@@ -2933,7 +2933,7 @@ public void testFlushLargestWriter() throws IOException, InterruptedException {
     int numRamDocs = w.numRamDocs();
     int numDocsInDWPT = largestNonPendingWriter.getNumDocsInRAM();
     assertTrue(w.flushNextBuffer());
-    assertTrue(largestNonPendingWriter.hasFlushed());
+    assertEquals(DocumentsWriterPerThread.State.FLUSHED, largestNonPendingWriter.getState());

Review comment:
       ok, fine it is then.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] s1monw commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
s1monw commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r496489937



##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##########
@@ -157,25 +157,18 @@ private boolean updatePeaks(long delta) {
   }
 
   DocumentsWriterPerThread doAfterDocument(DocumentsWriterPerThread perThread, boolean isUpdate) {
-    final long delta = perThread.getCommitLastBytesUsedDelta();
+    final long delta = perThread.commitLastBytesUsed();
     synchronized (this) {
-      // we need to commit this under lock but calculate it outside of the lock to minimize the time this lock is held
-      // per document. The reason we update this under lock is that we mark DWPTs as pending without acquiring it's
-      // lock in #setFlushPending and this also reads the committed bytes and modifies the flush/activeBytes.
-      // In the future we can clean this up to be more intuitive.
-      perThread.commitLastBytesUsed(delta);
       try {
         /*
          * We need to differentiate here if we are pending since setFlushPending
          * moves the perThread memory to the flushBytes and we could be set to
          * pending during a delete
          */
-        if (perThread.isFlushPending()) {
-          flushBytes += delta;
-          assert updatePeaks(delta);
-        } else {
-          activeBytes += delta;
-          assert updatePeaks(delta);
+        activeBytes += delta;
+        assert updatePeaks(delta);
+        if (perThread.isFlushPending() == false) {
+          assert perThread.getState() == DocumentsWriterPerThread.State.ACTIVE : "expected ACTIVE state but was: " + perThread.getState();

Review comment:
       that makes sense to me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] s1monw commented on pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
s1monw commented on pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#issuecomment-699864240


   this is a followup from https://github.com/apache/lucene-solr/pull/1918


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dweiss commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r495782001



##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -593,9 +568,61 @@ void unlock() {
   }
 
   /**
-   * Returns <code>true</code> iff this DWPT has been flushed
+   * Returns the DWPTs current state.
    */
-  boolean hasFlushed() {
-    return hasFlushed.get() == Boolean.TRUE;
+  State getState() {
+    return state;
   }
+
+  /**
+   * Transitions the DWPT to the given state of fails if the transition is invalid.
+   * @throws IllegalStateException if the given state can not be transitioned to.
+   */
+  synchronized void transitionTo(State state) {
+    if (state.canTransitionFrom(this.state) == false) {
+      throw new IllegalStateException("Can't transition from " + this.state + " to " + state);
+    }
+    assert state.mustHoldLock == false || isHeldByCurrentThread() : "illegal state: " + state + " lock is held: " + isHeldByCurrentThread();
+    this.state = state;
+  }
+
+  /**
+   * Internal DWPT State.
+   */
+  enum State {
+    /**
+     * Default states when a DWPT is initialized and ready to index documents.
+     */
+    ACTIVE(null, true),
+    /**
+     * The DWPT can still index documents but should be moved to FLUSHING state as soon as possible.
+     * Transitions to this state can be done concurrently while another thread is actively indexing into this DWPT.
+     */
+    FLUSH_PENDING(ACTIVE, false),
+    /**
+     * The DWPT should not receive any further documents and is current flushing or queued up for flushing.
+     */
+    FLUSHING(FLUSH_PENDING, true),
+    /**
+     * The DWPT has been flushed and is ready to be garbage collected.
+     */
+    FLUSHED(FLUSHING, false);
+
+    private final State previousState;
+    final boolean mustHoldLock; // only for asserts
+
+    State(State previousState, boolean mustHoldLock) {

Review comment:
       You have time until the end of this week [George Dantzig]




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dweiss commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
dweiss commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r495776845



##########
File path: lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java
##########
@@ -2933,7 +2933,7 @@ public void testFlushLargestWriter() throws IOException, InterruptedException {
     int numRamDocs = w.numRamDocs();
     int numDocsInDWPT = largestNonPendingWriter.getNumDocsInRAM();
     assertTrue(w.flushNextBuffer());
-    assertTrue(largestNonPendingWriter.hasFlushed());
+    assertEquals(DocumentsWriterPerThread.State.FLUSHED, largestNonPendingWriter.getState());

Review comment:
       Wouldn't this be more pleasant to read?
   assertTrue largestNonPendingWriter.inState(DocumentsWriterPerThread.State.FLUSHED)?

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -593,9 +568,61 @@ void unlock() {
   }
 
   /**
-   * Returns <code>true</code> iff this DWPT has been flushed
+   * Returns the DWPTs current state.
    */
-  boolean hasFlushed() {
-    return hasFlushed.get() == Boolean.TRUE;
+  State getState() {
+    return state;
   }
+
+  /**
+   * Transitions the DWPT to the given state of fails if the transition is invalid.
+   * @throws IllegalStateException if the given state can not be transitioned to.
+   */
+  synchronized void transitionTo(State state) {
+    if (state.canTransitionFrom(this.state) == false) {
+      throw new IllegalStateException("Can't transition from " + this.state + " to " + state);
+    }
+    assert state.mustHoldLock == false || isHeldByCurrentThread() : "illegal state: " + state + " lock is held: " + isHeldByCurrentThread();
+    this.state = state;
+  }
+
+  /**
+   * Internal DWPT State.
+   */
+  enum State {
+    /**
+     * Default states when a DWPT is initialized and ready to index documents.
+     */
+    ACTIVE(null, true),
+    /**
+     * The DWPT can still index documents but should be moved to FLUSHING state as soon as possible.
+     * Transitions to this state can be done concurrently while another thread is actively indexing into this DWPT.
+     */
+    FLUSH_PENDING(ACTIVE, false),
+    /**
+     * The DWPT should not receive any further documents and is current flushing or queued up for flushing.
+     */
+    FLUSHING(FLUSH_PENDING, true),
+    /**
+     * The DWPT has been flushed and is ready to be garbage collected.
+     */
+    FLUSHED(FLUSHING, false);
+
+    private final State previousState;
+    final boolean mustHoldLock; // only for asserts
+
+    State(State previousState, boolean mustHoldLock) {

Review comment:
       Now you only need a formal correctness solver for this Petri net and you'll be good. 👍 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
mikemccand commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r496729477



##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##########
@@ -157,49 +157,42 @@ private boolean updatePeaks(long delta) {
   }
 
   DocumentsWriterPerThread doAfterDocument(DocumentsWriterPerThread perThread, boolean isUpdate) {
-    final long delta = perThread.getCommitLastBytesUsedDelta();
+    final long delta = perThread.commitLastBytesUsed();
     synchronized (this) {
-      // we need to commit this under lock but calculate it outside of the lock to minimize the time this lock is held
-      // per document. The reason we update this under lock is that we mark DWPTs as pending without acquiring it's
-      // lock in #setFlushPending and this also reads the committed bytes and modifies the flush/activeBytes.
-      // In the future we can clean this up to be more intuitive.

Review comment:
       ;)

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java
##########
@@ -446,7 +446,7 @@ long updateDocuments(final Iterable<? extends Iterable<? extends IndexableField>
   private boolean doFlush(DocumentsWriterPerThread flushingDWPT) throws IOException {
     boolean hasEvents = false;
     while (flushingDWPT != null) {
-      assert flushingDWPT.hasFlushed() == false;
+      assert flushingDWPT.getState() == DocumentsWriterPerThread.State.FLUSHING : "expected FLUSHING but was: " + flushingDWPT.getState();

Review comment:
       Maybe add a `boolean assertState(State)` to DWPT that does the above `assert` and then returns `true` so we can call it under `assert`?  Only downside is then we have double looking `assert`, i.e. `assert flushingDWPT.assertState(State.FLUSHING)`.  Looks like we are doing this in at least three places here ...

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##########
@@ -157,49 +157,42 @@ private boolean updatePeaks(long delta) {
   }
 
   DocumentsWriterPerThread doAfterDocument(DocumentsWriterPerThread perThread, boolean isUpdate) {
-    final long delta = perThread.getCommitLastBytesUsedDelta();
+    final long delta = perThread.commitLastBytesUsed();
     synchronized (this) {
-      // we need to commit this under lock but calculate it outside of the lock to minimize the time this lock is held
-      // per document. The reason we update this under lock is that we mark DWPTs as pending without acquiring it's
-      // lock in #setFlushPending and this also reads the committed bytes and modifies the flush/activeBytes.
-      // In the future we can clean this up to be more intuitive.
-      perThread.commitLastBytesUsed(delta);
       try {
         /*
          * We need to differentiate here if we are pending since setFlushPending
          * moves the perThread memory to the flushBytes and we could be set to
          * pending during a delete
          */
-        if (perThread.isFlushPending()) {
-          flushBytes += delta;
-          assert updatePeaks(delta);
-        } else {
-          activeBytes += delta;
-          assert updatePeaks(delta);
+        activeBytes += delta;

Review comment:
       Hmm, why are we removing the conditional on "flush pending" here (and always updating `activeBytes`, never `flushBytes`)?  I guess `setFlushPending` will move all `activeBytes` over to `flushBytes`, so it is fine for us to always increment `activeBytes` here?
   
   Edit: hmm it looks like we are also removing `setFlushPending`'s moving of `activeBytes` to `flushBytes` ;)  So now I don't quite understand how we are changing the RAM accounting here.
   
   Edit again!: OK I see, we moved the RAM shifting into `checkoutFlushableWriter`, OK good!
   
   Should we fix (or maybe just remove) the above comment now?

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##########
@@ -295,27 +289,32 @@ public synchronized void waitForFlush() {
    * {@link DocumentsWriterPerThread} must have indexed at least on Document and must not be
    * already pending.
    */
-  public synchronized void setFlushPending(DocumentsWriterPerThread perThread) {
-    assert !perThread.isFlushPending();
+  synchronized void setFlushPending(DocumentsWriterPerThread perThread) {
+    assert !perThread.isFlushPending() : "state: " + perThread.getState();
     if (perThread.getNumDocsInRAM() > 0) {
-      perThread.setFlushPending(); // write access synced
-      final long bytes = perThread.getLastCommittedBytesUsed();
-      flushBytes += bytes;
-      activeBytes -= bytes;

Review comment:
       Hmm we are no longer moving `bytes` from `activeBytes` to `flushedBytes` here?  Are we (the caller here?) doing that elsewhere?

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -513,48 +515,21 @@ public String toString() {
         + numDocsInRAM + ", deleteQueue=" + deleteQueue + ", " + numDeletedDocIds + " deleted docIds" + "]";
   }
 
-
   /**
    * Returns true iff this DWPT is marked as flush pending
    */
   boolean isFlushPending() {
-    return flushPending.get() == Boolean.TRUE;
-  }
-
-  /**
-   * Sets this DWPT as flush pending. This can only be set once.
-   */
-  void setFlushPending() {
-    flushPending.set(Boolean.TRUE);
-  }
-
-
-  /**
-   * Returns the last committed bytes for this DWPT. This method can be called
-   * without acquiring the DWPTs lock.
-   */
-  long getLastCommittedBytesUsed() {
-    return lastCommittedBytesUsed;
-  }
-
-  /**
-   * Commits the current {@link #ramBytesUsed()} and stores it's value for later reuse.
-   * The last committed bytes used can be retrieved via {@link #getLastCommittedBytesUsed()}
-   */
-  void commitLastBytesUsed(long delta) {
-    assert isHeldByCurrentThread();
-    assert getCommitLastBytesUsedDelta() == delta : "delta has changed";
-    lastCommittedBytesUsed += delta;
+    return state == State.FLUSH_PENDING;

Review comment:
       Maybe we should remove `isFlushPending` method entirely and expect callers to check `state == State.FLUSH_PENDING` themselves?

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -593,9 +568,61 @@ void unlock() {
   }
 
   /**
-   * Returns <code>true</code> iff this DWPT has been flushed
+   * Returns the DWPTs current state.
    */
-  boolean hasFlushed() {
-    return hasFlushed.get() == Boolean.TRUE;
+  State getState() {
+    return state;
   }
+
+  /**
+   * Transitions the DWPT to the given state of fails if the transition is invalid.

Review comment:
       s/`of`/`or`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] s1monw commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
s1monw commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r495779250



##########
File path: lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java
##########
@@ -2933,7 +2933,7 @@ public void testFlushLargestWriter() throws IOException, InterruptedException {
     int numRamDocs = w.numRamDocs();
     int numDocsInDWPT = largestNonPendingWriter.getNumDocsInRAM();
     assertTrue(w.flushNextBuffer());
-    assertTrue(largestNonPendingWriter.hasFlushed());
+    assertEquals(DocumentsWriterPerThread.State.FLUSHED, largestNonPendingWriter.getState());

Review comment:
       I guess it would. I am not sure if we should really add more methods just for tests?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] s1monw commented on pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
s1monw commented on pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#issuecomment-699863956


   @mikemccand can you take a look at this?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dnhatn commented on a change in pull request #1925: Cleanup DWPT state handling

Posted by GitBox <gi...@apache.org>.
dnhatn commented on a change in pull request #1925:
URL: https://github.com/apache/lucene-solr/pull/1925#discussion_r496348288



##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -284,7 +283,7 @@ FrozenBufferedUpdates prepareFlush() {
 
   /** Flush all pending docs to a new segment */
   FlushedSegment flush(DocumentsWriter.FlushNotifications flushNotifications) throws IOException {
-    assert flushPending.get() == Boolean.TRUE;
+    assert state == State.FLUSHING;

Review comment:
       Maybe add the current state to the assertion.

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -169,6 +167,7 @@ long updateDocuments(Iterable<? extends Iterable<? extends IndexableField>> docs
     try {
       testPoint("DocumentsWriterPerThread addDocuments start");
       assert abortingException == null: "DWPT has hit aborting exception but is still indexing";
+      assert state == State.ACTIVE || state == State.FLUSH_PENDING : "Illegal state: " + state + " must be ACTIVE of FLUSH_PENDING";

Review comment:
       nit: of -> or

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##########
@@ -157,25 +157,18 @@ private boolean updatePeaks(long delta) {
   }
 
   DocumentsWriterPerThread doAfterDocument(DocumentsWriterPerThread perThread, boolean isUpdate) {
-    final long delta = perThread.getCommitLastBytesUsedDelta();
+    final long delta = perThread.commitLastBytesUsed();
     synchronized (this) {
-      // we need to commit this under lock but calculate it outside of the lock to minimize the time this lock is held
-      // per document. The reason we update this under lock is that we mark DWPTs as pending without acquiring it's
-      // lock in #setFlushPending and this also reads the committed bytes and modifies the flush/activeBytes.
-      // In the future we can clean this up to be more intuitive.
-      perThread.commitLastBytesUsed(delta);
       try {
         /*
          * We need to differentiate here if we are pending since setFlushPending
          * moves the perThread memory to the flushBytes and we could be set to
          * pending during a delete
          */
-        if (perThread.isFlushPending()) {
-          flushBytes += delta;
-          assert updatePeaks(delta);
-        } else {
-          activeBytes += delta;
-          assert updatePeaks(delta);
+        activeBytes += delta;
+        assert updatePeaks(delta);
+        if (perThread.isFlushPending() == false) {
+          assert perThread.getState() == DocumentsWriterPerThread.State.ACTIVE : "expected ACTIVE state but was: " + perThread.getState();

Review comment:
       Do we still need `isFlushPending` method? Should we compare the state of `perThread` to ACTIVE or FLUSH_PENDING instead.

##########
File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java
##########
@@ -593,9 +568,61 @@ void unlock() {
   }
 
   /**
-   * Returns <code>true</code> iff this DWPT has been flushed
+   * Returns the DWPTs current state.
    */
-  boolean hasFlushed() {
-    return hasFlushed.get() == Boolean.TRUE;
+  State getState() {
+    return state;
   }
+
+  /**
+   * Transitions the DWPT to the given state of fails if the transition is invalid.
+   * @throws IllegalStateException if the given state can not be transitioned to.
+   */
+  synchronized void transitionTo(State state) {
+    if (state.canTransitionFrom(this.state) == false) {
+      throw new IllegalStateException("Can't transition from " + this.state + " to " + state);
+    }
+    assert state.mustHoldLock == false || isHeldByCurrentThread() : "illegal state: " + state + " lock is held: " + isHeldByCurrentThread();
+    this.state = state;
+  }
+
+  /**
+   * Internal DWPT State.
+   */
+  enum State {
+    /**
+     * Default states when a DWPT is initialized and ready to index documents.
+     */
+    ACTIVE(null, true),
+    /**
+     * The DWPT can still index documents but should be moved to FLUSHING state as soon as possible.
+     * Transitions to this state can be done concurrently while another thread is actively indexing into this DWPT.
+     */
+    FLUSH_PENDING(ACTIVE, false),
+    /**
+     * The DWPT should not receive any further documents and is current flushing or queued up for flushing.
+     */
+    FLUSHING(FLUSH_PENDING, true),
+    /**
+     * The DWPT has been flushed and is ready to be garbage collected.
+     */
+    FLUSHED(FLUSHING, false);
+
+    private final State previousState;
+    final boolean mustHoldLock; // only for asserts

Review comment:
       This can be private?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org