Posted to derby-dev@db.apache.org by Raymond Raymond <ra...@hotmail.com> on 2006/03/09 21:41:55 UTC

A question about the Clock.java!cleanCache() method

I have been reading the source code of
org.apache.derby.impl.services.cache.Clock.java!cleanCache(),
which writes out all dirty pages, and I have a question about it.
The outline of the cleanCache() method is:

protected void cleanCache(Matchable partialKey) throws StandardException {
  int position;

  synchronized(this)
  {
    position = holders.size() - 1;
  }
outerscan:
  for (;;) {
    CachedItem item = null;
    synchronized (this) {
      // the cache may have shrunk by quite a bit since we last came in here
      int size = holders.size();
      if (position >= size)
        position = size - 1;
innerscan:
      // go from position (the last cached item in the holder array) to 0 (the first).
      for ( ;  position >= 0; position--, item = null) {
        if a valid dirty page is found
          break innerscan;
      }
    } // end of synchronized block

    if (position < 0) {
      return;
    }

    try {
      clean the found dirty page
    } finally {
      release the found dirty page
    }
    position--;
  } // for (;;)
}

Under the current implementation, when this method is called by multiple
threads, every thread searches from the end of the cache list to the beginning.
For instance, suppose there are 10 (holders.size()) cached pages in the list,
the 8th, 7th, and 6th pages are dirty, and 3 threads are calling this method
at runtime (on the same Clock object). Since the variable "position" is
declared inside the method, each thread keeps its own copy of it (if
"position" were a member of the class, all of the threads would share the
same copy of it, am I right here?). The method then works like this (no
shrinking, no new dirty pages):
Thread1 comes in, searches from 9 to 0, finds that the 8th is
dirty, and breaks out of the innerscan loop to do the clean.
Thread2 comes in, searches from 9 to 0, finds that the 7th is
dirty, and breaks out of the innerscan loop to do the clean.
Thread3 comes in, searches from 9 to 0, finds that the 6th is
dirty, and breaks out of the innerscan loop to do the clean.
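
On the local-versus-member question: in Java a local variable lives in the
calling thread's own stack frame, while an instance field exists once per
object and is seen by every thread using that object. A minimal sketch of the
difference follows. The class and method names (ScanCursorDemo,
startWithLocal, startWithField) are made up for illustration and are not part
of Derby; the threads are run one after another so the result is repeatable.

```java
// Toy illustration only; this is not Derby code.
class ScanCursorDemo {
    // Instance field: one copy per object, shared by every calling thread.
    private int sharedPosition = 9;

    // 'position' is a local variable, so each call starts from 9 again.
    synchronized int startWithLocal() {
        int position = 9;
        return position;
    }

    // Each call sees the value left behind by the previous caller.
    synchronized int startWithField() {
        int start = sharedPosition;
        sharedPosition--;
        return start;
    }

    // Runs three threads one after another (join() keeps the result
    // deterministic) and records where each one would start scanning.
    static int[][] run() {
        ScanCursorDemo demo = new ScanCursorDemo();
        int[][] starts = new int[2][3];
        for (int i = 0; i < 3; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                starts[0][id] = demo.startWithLocal();
                starts[1][id] = demo.startWithField();
            });
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        int[][] starts = run();
        for (int i = 0; i < 3; i++) {
            System.out.println("Thread" + (i + 1)
                    + ": local start = " + starts[0][i]
                    + ", field start = " + starts[1][i]);
        }
    }
}
```

With the local variable every thread starts at 9; with the field the starting
points are 9, 8 and 7, which is the sharing described in the parenthesis above.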

Each thread searches from 9 to 0. The problem this may cause is that,
if Derby is busy with updates, lots of new dirty pages may be generated
after thread1 exits the synchronized block and before thread2 enters it.
So each later caller has a greater chance of finding more and more
dirty pages to write out.

As I said, if "position" were a member of the class, all of the threads
would share the same copy of the variable. The method would then work like this:
Thread1 comes in, searches from 9 to 0, finds that the 8th is
dirty, and breaks out of the innerscan loop to do the clean (position = 8).
Thread2 comes in, searches from 8 to 0, finds that the 7th is
dirty, and breaks out of the innerscan loop to do the clean (position = 7).
Thread3 comes in, searches from 7 to 0, finds that the 6th is
dirty, and breaks out of the innerscan loop to do the clean (position = 6).

I am not sure which behavior is expected. The second one seems more efficient.



Thanks.


Raymond


Re: A question about the Clock.java!cleanCache() method

Posted by Mike Matrigali <mi...@sbcglobal.net>.

The normal case for Derby usage currently is that there is only
one thread which calls this routine, the background checkpoint
process.  So little has been done to worry about concurrent execution.
There are a few other checkpoint cases, mostly due to recovery and
backup, but those are definitely not the normal case.

Having said that, your proposed implementation would break the contract
of that interface.  The call guarantees that ALL pages which are
dirty when the call gets made will be written out; it does not matter
that such a page was just written out prior to the call.  In your
case, when Thread2 comes in, if page 9 had been made dirty again after
Thread1 wrote it, then Thread2 would expect page 9 to be flushed again,
but it wouldn't be under the change.  If Thread2 is the background
checkpoint thread, it is making the assumption that it can get rid of log
records associated with page 9 because the changes have been flushed to
disk, but in this case that would not be true.
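
The scenario above can be sketched with a toy model. This is only an
illustration under simplifying assumptions, not Derby's Clock class: pages are
slots in a boolean array (true means dirty), "cleaning" just clears the flag,
the two callers are simulated sequentially instead of with real threads, and
every name (ToyCache, cleanOneShared, and so on) is invented for the example.

```java
// Toy model only; not Derby's Clock implementation.
class ToyCache {
    final boolean[] dirty = new boolean[10];
    // Hypothetical shared cursor, as in the proposed change.
    private int sharedPosition = dirty.length - 1;

    // Clean the first dirty page at or below 'from'; return its index or -1.
    private int cleanNext(int from) {
        for (int p = from; p >= 0; p--) {
            if (dirty[p]) {
                dirty[p] = false; // stands in for writing the page out
                return p;
            }
        }
        return -1;
    }

    // Current approach: the cursor is local, so every call rescans from the end.
    void cleanCacheLocal() {
        int position = dirty.length - 1;
        while ((position = cleanNext(position)) >= 0) {
            position--;
        }
    }

    // Proposed approach, one step: resume from wherever the last caller stopped.
    int cleanOneShared() {
        int found = cleanNext(sharedPosition);
        sharedPosition = (found < 0) ? -1 : found - 1;
        return found;
    }

    // Proposed approach, full call: keep stepping until nothing is found.
    void cleanCacheShared() {
        while (cleanOneShared() >= 0) {
            // keep going
        }
    }

    // Returns { page 9 still dirty with shared cursor, same with local cursor }.
    static boolean[] demo() {
        ToyCache shared = new ToyCache();
        shared.dirty[8] = shared.dirty[7] = shared.dirty[6] = true;
        shared.cleanOneShared();   // "Thread1" cleans page 8, cursor is now 7
        shared.dirty[9] = true;    // page 9 is modified behind the cursor
        shared.cleanCacheShared(); // "Thread2" cleans 7 and 6, never sees 9

        ToyCache local = new ToyCache();
        local.dirty[7] = local.dirty[6] = true; // page 8 already cleaned
        local.dirty[9] = true;     // page 9 is modified before the call
        local.cleanCacheLocal();   // rescans from 9, so page 9 is written

        return new boolean[] { shared.dirty[9], local.dirty[9] };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("shared cursor, page 9 still dirty: " + result[0]);
        System.out.println("local cursor, page 9 still dirty: " + result[1]);
    }
}
```

In this model the shared cursor leaves page 9 dirty after the second call
returns, while the local cursor does not, which is the guarantee a checkpoint
relies on before discarding log records.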
