You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@wicket.apache.org by Jan Vermeulen <ja...@isencia.com> on 2007/07/11 19:43:43 UTC

Thread synchronization problems in FilePageStore

As a consequence of 'random' errors in a single-page application using
Wicket, that always occurred after restoring a page from the FilePageStore,
I have been debugging the FilePageStore extensively.

I have extended the FilePageStoreTest with a method to simulate the errors
(see code included at the end of this post).

When does the error occur ?
I suppose the error is unlikely to occur in an application in which
navigation is between pages. But in a 'single-page' application, or an
application that allows for state changes within the same page, it is likely
to  occur. The error occurs when someone generates a new version of a page,
requests a previous version of that same page, and repeats that 'forward and
backward' movement. Since the page id remains the same, and the
versionNumber gets incremented and decremented, there is a case in which he
ends up with an incorrect restored page, i.e. a page that corresponds to a
previous 'forward and backward' movement: it has the same page id and
versionNumber, but not necessarily the same state.

What's really going wrong here ? 
Finally I found 2 'holes' in the thread synchronization between the
different threads that operate in the process of page storage. (1) is more
likely to happen than (2).

(1) The thread that saves the pages in a file, sleeps during 1 second
(before 2, in which the error occurred  more easily). Then it gets an
iterator on the 'pagesToBeSaved' map. For being a ConcurrentHashMap, that
iterator takes a snapshot of the contents of the map, and works on that
snapshot. For performance reasons, this process is does not lock the
'pagesToBeSaved' map. So someone can come in an store a new entry in the map
while the iterator is busy saving pages. If the entry happens to have the
same key as one that is already present, it simply replaces that entry.
Problem is now that the saving thread saves the cached (now obsolete) entry,
and finally removes the (updated) entry. So the last added entry never gets 
saved, and on the next retrieval, the wrong page will be returned, because
that's what's in the file.  

(2) The thread that serializes the pages correctly locks access to the list
of pages to be serialized for a given session, but when it has serialized a
page, it first removes that entry from the 'pagesToBeSerialized' list, and
only after that (without locking) adds that entry to the 'pagesToBeSaved'.
That leaves a small hole where the given entry is in neither of the 2 maps.
With some bad luck (it actually occurred in realtime), the thread that looks
for a page to restore it, might not find it in either of the two lists, and
thus decide to restore it from file. And since there was already some file
with that key, it ends up restoring the wrong page. This is a less likely
scenario to occur, but it can be easily provoked by adding a sleep() between
removal from one list and addition to the other to exagerate the hole.

Solutions to the problem 
With the following changes in FilePageStore, the holes can be tapped:

(1) change PageSavingThread to iterate the keys, and lookup actual the value
for each iteration. That way, we reduce the change of saving an obsolete
page if the 'pagesToBeSaved' map has been updated after the iterator has
been created. Advantage: no locking for the whole iteration process is
needed.

	Iterator it = pagesToBeSaved.keySet().iterator();
	while (it.hasNext())
	{
		synchronized (pagesToBeSaved) {
			SessionPageKey key = (SessionPageKey)pagesToBeSaved.get(it.next());
			if (key.data instanceof byte[])
			{
				savePage(key, (byte[])key.data);
			}
			pagesToBeSaved.remove(key);
		}
	}

(2) add a synchronize lock on the 'pagesToBeSaved' map for each single
iteration, to prevent that another thread can update an entry while it is
being saved and removed. The updates on 'pagesToBeSaved' must also lock
access to the map. This prevents the update of an entry in the map while the
saving thread is saving and removing the entry.

	synchronized (pagesToBeSaved) {
		pagesToBeSaved.put(key, key);
	}

(3) change the order in which the PageSerializingThread removes entries from
the 'pagesToBeSerialized' map and adds entries to the 'pagesToBeSaved' map:
if the entry is added to the 'pagesToBeSaved' before being removed from the
'pagesToBeSerialized', there can be no case in which it is neither of the
two lists. The fact that it might appear in both lists at a single moment,
does not affect the correct restore of pages in any way.

	synchronized (sessionList)
	{
		key.setObject(pageBytes);
		synchronized (pagesToBeSaved) {
			pagesToBeSaved.put(key, key);
		}
		sessionList.remove(key);
		sessionList.notifyAll();
	}

Applying these changes, I have been able to eliminate the errors that
occurred. I tried them out with various 'sleep' times in the unitTest.

Below is the code for the unitTest:

A simple TestPage to be used within the unitTest

package org.apache.wicket.protocol.http;

import org.apache.wicket.markup.html.WebPage;
import org.apache.wicket.model.Model;

class TestPage extends WebPage {
	private static final long serialVersionUID = 1L;
	
	public TestPage() {
		setModel(new Model());
	}
	
	public void setValue(int value) {
		// force a version change
		setModelObject(new Integer(value));
		// force end of version
		onDetach();
	}
	
	public int getValue() {
		return(((Integer)getModelObject()).intValue());
	}
	
	public boolean isBookmarkable() {
		return(false);
	}
	
	public boolean isVersioned() {
		return(true);
	}
}

the extended FilePageStoreTest

package org.apache.wicket.protocol.http;

import org.apache.wicket.Application;
import org.apache.wicket.IPageMap;
import org.apache.wicket.Session;
import org.apache.wicket.WicketTestCase;
import
org.apache.wicket.protocol.http.SecondLevelCacheSessionStore.IPageStore;

/**
 * @author jcompagner
 */
public class FilePageStoreTest extends WicketTestCase
{
	...

	private void sleep(int time) {
		try
		{
			Thread.sleep(time);
		}
		catch (InterruptedException e)
		{
			throw new RuntimeException(e);
		}
	}
	
	/**
	 * This method simulates a situation in which 2 versions of a page are
created,
	 * and 2 subsequent 'backs' restore the previous versions of that page,
	 * ending in the restore of the original page.
	 *
	 * When repeating this process a number of times, and if the response times
are
	 * below the cycle times of the serialization and save threads of the
FilePageStore,
	 * synchronization errors occur: since the version numbers overlap between
the loops
	 * (version 1 for loop 2 is stored with the same key as version 1 for loop
1), 
	 * there is a case where the wrong page is restored: it returns the page
with
	 * the same id and versionNumber from a previous loop.
	 * 
	 * To detect that problem, a value unique to the loop is added to the page,
	 * and when restored, it is checked if the correct value is in the restored
page.
	 *
	 * Time between requests and number of loops can be variated to simulate
the problem.
	 */
	public void testRestorePage() {
		// time to wait between subsequent 'requests' (miliseconds)
		int time = 5;
		
		// number of loops
		int loops = 100;

		// creation of a non-bookmarkable and versioned page
		TestPage page = new TestPage();
		
		tester.setupRequestAndResponse();
		Session session = Session.get();

		IPageStore pageStore =
((SecondLevelCacheSessionStore)Application.get().getSessionStore()).getStore();

		// pagemap is needed for storage
		// (without that, the versionManager never increments the versionNumber
		//  because getLastVersion() is based on a cache in the pageMap)
		IPageMap pageMap = page.getPageMap();

		// store the original version 0
		page.setValue(0);
		pageMap.put(page);
		
		sleep(time);
		
		for (int i = 1; i <= loops; i++) {
			// first requestCycle: change value & store version 1 
			pageStore.pageAccessed(session.getId(),page);
			page.setValue(i);
			pageMap.put(page);
			
			sleep(time);

			// second requestCycle: change value & store version 2 
			pageStore.pageAccessed(session.getId(),page);
			page.setValue(0);
			pageMap.put(page);
			
			sleep(time);

			// third requestCycle: restore version 1 (expected value: i)
			pageStore.pageAccessed(session.getId(),page);
			page = (TestPage)pageStore.getPage(session.getId(),null,0,1,0);
			assertNotNull("loop "+i+": no page found for version 1",page);
			assertEquals("loop "+i+": incorrect page restore for version
1:",i,page.getValue());
			pageMap.put(page);
			
			sleep(time);
			
			// fourth requestCycle: restore version 0 (expected value: 0)
			pageStore.pageAccessed(session.getId(),page);
			page = (TestPage)pageStore.getPage(session.getId(),null,0,0,0);
			assertNotNull("loop "+i+": no page found for version 0",page);
			assertEquals("loop "+i+": incorrect page restore for version
0:",0,page.getValue());
			pageMap.put(page);
			
			sleep(time);
		}
	}
-- 
View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tf4063498.html#a11545253
Sent from the Wicket - Dev mailing list archive at Nabble.com.


Re: Thread synchronization problems in FilePageStore

Posted by Jan Vermeulen <ja...@isencia.com>.

Matej Knopp-2 wrote:
> 
> Well, it's very new and not very tested. Therefore it's
> "experimental". It might work well but there might also be issues. It
> would help a lot if you tried using it and if you come across any
> problems let me know.
> 
> -Matej
> 

We'll try it out in our TRUNK, and stick to a 'fixed' FilePageStore for the
branches.
Any problems we'll surely report.

Jan.
-- 
View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tf4063498.html#a11558992
Sent from the Wicket - Dev mailing list archive at Nabble.com.


Re: Thread synchronization problems in FilePageStore

Posted by Johan Compagner <jc...@gmail.com>.
i will also look more closely to this the comming weekend.
i guess we can drop the filestore if we can't find much wrong with the
diskpagestore

johan


On 7/12/07, Matej Knopp <ma...@gmail.com> wrote:
>
> Well, it's very new and not very tested. Therefore it's
> "experimental". It might work well but there might also be issues. It
> would help a lot if you tried using it and if you come across any
> problems let me know.
>
> -Matej
>
> On 7/12/07, Jan Vermeulen <ja...@isencia.com> wrote:
> >
> >
> > Matej Knopp-2 wrote:
> > >
> > > Hi, there's a new experimental DiskPageStore in Wicket trunk, would
> > > you mind checking if it suffers from the same problem?
> > >
> >
> > I ran the test on the new DiskPageStore with various sleep-times and up
> to
> > 1000 loops, and it never failed.
> >
> > I see it is storing various pages in a single file, and uses a
> > FileChannelPool to improve performance. Can we safely use the new
> > DiskPageStore as an alternative for FilePageStore ?
> >
> > Jan.
> > --
> > View this message in context:
> http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tf4063498.html#a11556698
> > Sent from the Wicket - Dev mailing list archive at Nabble.com.
> >
> >
>

Re: Thread synchronization problems in FilePageStore

Posted by Matej Knopp <ma...@gmail.com>.
Well, it's very new and not very tested. Therefore it's
"experimental". It might work well but there might also be issues. It
would help a lot if you tried using it and if you come across any
problems let me know.

-Matej

On 7/12/07, Jan Vermeulen <ja...@isencia.com> wrote:
>
>
> Matej Knopp-2 wrote:
> >
> > Hi, there's a new experimental DiskPageStore in Wicket trunk, would
> > you mind checking if it suffers from the same problem?
> >
>
> I ran the test on the new DiskPageStore with various sleep-times and up to
> 1000 loops, and it never failed.
>
> I see it is storing various pages in a single file, and uses a
> FileChannelPool to improve performance. Can we safely use the new
> DiskPageStore as an alternative for FilePageStore ?
>
> Jan.
> --
> View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tf4063498.html#a11556698
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>

Re: Thread synchronization problems in FilePageStore

Posted by Jan Vermeulen <ja...@isencia.com>.

Matej Knopp-2 wrote:
> 
> Hi, there's a new experimental DiskPageStore in Wicket trunk, would
> you mind checking if it suffers from the same problem?
> 

I ran the test on the new DiskPageStore with various sleep-times and up to
1000 loops, and it never failed.

I see it is storing various pages in a single file, and uses a
FileChannelPool to improve performance. Can we safely use the new
DiskPageStore as an alternative for FilePageStore ?

Jan.
-- 
View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tf4063498.html#a11556698
Sent from the Wicket - Dev mailing list archive at Nabble.com.


Re: Thread synchronization problems in FilePageStore

Posted by Eelco Hillenius <ee...@gmail.com>.
I opened an issue here http://issues.apache.org/jira/browse/WICKET-746

Eelco

Re: Thread synchronization problems in FilePageStore

Posted by Matej Knopp <ma...@gmail.com>.
Hi, there's a new experimental DiskPageStore in Wicket trunk, would
you mind checking if it suffers from the same problem?

-Matej

On 7/11/07, Jan Vermeulen <ja...@isencia.com> wrote:
>
> As a consequence of 'random' errors in a single-page application using
> Wicket, that always occurred after restoring a page from the FilePageStore,
> I have been debugging the FilePageStore extensively.
>
> I have extended the FilePageStoreTest with a method to simulate the errors
> (see code included at the end of this post).
>
> When does the error occur ?
> I suppose the error is unlikely to occur in an application in which
> navigation is between pages. But in a 'single-page' application, or an
> application that allows for state changes within the same page, it is likely
> to  occur. The error occurs when someone generates a new version of a page,
> requests a previous version of that same page, and repeats that 'forward and
> backward' movement. Since the page id remains the same, and the
> versionNumber gets incremented and decremented, there is a case in which he
> ends up with an incorrect restored page, i.e. a page that corresponds to a
> previous 'forward and backward' movement: it has the same page id and
> versionNumber, but not necessarily the same state.
>
> What's really going wrong here ?
> Finally I found 2 'holes' in the thread synchronization between the
> different threads that operate in the process of page storage. (1) is more
> likely to happen than (2).
>
> (1) The thread that saves the pages in a file, sleeps during 1 second
> (before 2, in which the error occurred  more easily). Then it gets an
> iterator on the 'pagesToBeSaved' map. For being a ConcurrentHashMap, that
> iterator takes a snapshot of the contents of the map, and works on that
> snapshot. For performance reasons, this process is does not lock the
> 'pagesToBeSaved' map. So someone can come in an store a new entry in the map
> while the iterator is busy saving pages. If the entry happens to have the
> same key as one that is already present, it simply replaces that entry.
> Problem is now that the saving thread saves the cached (now obsolete) entry,
> and finally removes the (updated) entry. So the last added entry never gets
> saved, and on the next retrieval, the wrong page will be returned, because
> that's what's in the file.
>
> (2) The thread that serializes the pages correctly locks access to the list
> of pages to be serialized for a given session, but when it has serialized a
> page, it first removes that entry from the 'pagesToBeSerialized' list, and
> only after that (without locking) adds that entry to the 'pagesToBeSaved'.
> That leaves a small hole where the given entry is in neither of the 2 maps.
> With some bad luck (it actually occurred in realtime), the thread that looks
> for a page to restore it, might not find it in either of the two lists, and
> thus decide to restore it from file. And since there was already some file
> with that key, it ends up restoring the wrong page. This is a less likely
> scenario to occur, but it can be easily provoked by adding a sleep() between
> removal from one list and addition to the other to exagerate the hole.
>
> Solutions to the problem
> With the following changes in FilePageStore, the holes can be tapped:
>
> (1) change PageSavingThread to iterate the keys, and lookup actual the value
> for each iteration. That way, we reduce the change of saving an obsolete
> page if the 'pagesToBeSaved' map has been updated after the iterator has
> been created. Advantage: no locking for the whole iteration process is
> needed.
>
>         Iterator it = pagesToBeSaved.keySet().iterator();
>         while (it.hasNext())
>         {
>                 synchronized (pagesToBeSaved) {
>                         SessionPageKey key = (SessionPageKey)pagesToBeSaved.get(it.next());
>                         if (key.data instanceof byte[])
>                         {
>                                 savePage(key, (byte[])key.data);
>                         }
>                         pagesToBeSaved.remove(key);
>                 }
>         }
>
> (2) add a synchronize lock on the 'pagesToBeSaved' map for each single
> iteration, to prevent that another thread can update an entry while it is
> being saved and removed. The updates on 'pagesToBeSaved' must also lock
> access to the map. This prevents the update of an entry in the map while the
> saving thread is saving and removing the entry.
>
>         synchronized (pagesToBeSaved) {
>                 pagesToBeSaved.put(key, key);
>         }
>
> (3) change the order in which the PageSerializingThread removes entries from
> the 'pagesToBeSerialized' map and adds entries to the 'pagesToBeSaved' map:
> if the entry is added to the 'pagesToBeSaved' before being removed from the
> 'pagesToBeSerialized', there can be no case in which it is neither of the
> two lists. The fact that it might appear in both lists at a single moment,
> does not affect the correct restore of pages in any way.
>
>         synchronized (sessionList)
>         {
>                 key.setObject(pageBytes);
>                 synchronized (pagesToBeSaved) {
>                         pagesToBeSaved.put(key, key);
>                 }
>                 sessionList.remove(key);
>                 sessionList.notifyAll();
>         }
>
> Applying these changes, I have been able to eliminate the errors that
> occurred. I tried them out with various 'sleep' times in the unitTest.
>
> Below is the code for the unitTest:
>
> A simple TestPage to be used within the unitTest
>
> package org.apache.wicket.protocol.http;
>
> import org.apache.wicket.markup.html.WebPage;
> import org.apache.wicket.model.Model;
>
> class TestPage extends WebPage {
>         private static final long serialVersionUID = 1L;
>
>         public TestPage() {
>                 setModel(new Model());
>         }
>
>         public void setValue(int value) {
>                 // force a version change
>                 setModelObject(new Integer(value));
>                 // force end of version
>                 onDetach();
>         }
>
>         public int getValue() {
>                 return(((Integer)getModelObject()).intValue());
>         }
>
>         public boolean isBookmarkable() {
>                 return(false);
>         }
>
>         public boolean isVersioned() {
>                 return(true);
>         }
> }
>
> the extended FilePageStoreTest
>
> package org.apache.wicket.protocol.http;
>
> import org.apache.wicket.Application;
> import org.apache.wicket.IPageMap;
> import org.apache.wicket.Session;
> import org.apache.wicket.WicketTestCase;
> import
> org.apache.wicket.protocol.http.SecondLevelCacheSessionStore.IPageStore;
>
> /**
>  * @author jcompagner
>  */
> public class FilePageStoreTest extends WicketTestCase
> {
>         ...
>
>         private void sleep(int time) {
>                 try
>                 {
>                         Thread.sleep(time);
>                 }
>                 catch (InterruptedException e)
>                 {
>                         throw new RuntimeException(e);
>                 }
>         }
>
>         /**
>          * This method simulates a situation in which 2 versions of a page are
> created,
>          * and 2 subsequent 'backs' restore the previous versions of that page,
>          * ending in the restore of the original page.
>          *
>          * When repeating this process a number of times, and if the response times
> are
>          * below the cycle times of the serialization and save threads of the
> FilePageStore,
>          * synchronization errors occur: since the version numbers overlap between
> the loops
>          * (version 1 for loop 2 is stored with the same key as version 1 for loop
> 1),
>          * there is a case where the wrong page is restored: it returns the page
> with
>          * the same id and versionNumber from a previous loop.
>          *
>          * To detect that problem, a value unique to the loop is added to the page,
>          * and when restored, it is checked if the correct value is in the restored
> page.
>          *
>          * Time between requests and number of loops can be variated to simulate
> the problem.
>          */
>         public void testRestorePage() {
>                 // time to wait between subsequent 'requests' (miliseconds)
>                 int time = 5;
>
>                 // number of loops
>                 int loops = 100;
>
>                 // creation of a non-bookmarkable and versioned page
>                 TestPage page = new TestPage();
>
>                 tester.setupRequestAndResponse();
>                 Session session = Session.get();
>
>                 IPageStore pageStore =
> ((SecondLevelCacheSessionStore)Application.get().getSessionStore()).getStore();
>
>                 // pagemap is needed for storage
>                 // (without that, the versionManager never increments the versionNumber
>                 //  because getLastVersion() is based on a cache in the pageMap)
>                 IPageMap pageMap = page.getPageMap();
>
>                 // store the original version 0
>                 page.setValue(0);
>                 pageMap.put(page);
>
>                 sleep(time);
>
>                 for (int i = 1; i <= loops; i++) {
>                         // first requestCycle: change value & store version 1
>                         pageStore.pageAccessed(session.getId(),page);
>                         page.setValue(i);
>                         pageMap.put(page);
>
>                         sleep(time);
>
>                         // second requestCycle: change value & store version 2
>                         pageStore.pageAccessed(session.getId(),page);
>                         page.setValue(0);
>                         pageMap.put(page);
>
>                         sleep(time);
>
>                         // third requestCycle: restore version 1 (expected value: i)
>                         pageStore.pageAccessed(session.getId(),page);
>                         page = (TestPage)pageStore.getPage(session.getId(),null,0,1,0);
>                         assertNotNull("loop "+i+": no page found for version 1",page);
>                         assertEquals("loop "+i+": incorrect page restore for version
> 1:",i,page.getValue());
>                         pageMap.put(page);
>
>                         sleep(time);
>
>                         // fourth requestCycle: restore version 0 (expected value: 0)
>                         pageStore.pageAccessed(session.getId(),page);
>                         page = (TestPage)pageStore.getPage(session.getId(),null,0,0,0);
>                         assertNotNull("loop "+i+": no page found for version 0",page);
>                         assertEquals("loop "+i+": incorrect page restore for version
> 0:",0,page.getValue());
>                         pageMap.put(page);
>
>                         sleep(time);
>                 }
>         }
> --
> View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tf4063498.html#a11545253
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>

Re: Thread synchronization problems in FilePageStore

Posted by Johan Compagner <jc...@gmail.com>.
loads of fixes and improvements happened in that area.
you really need to upgrade to the latest 1.3.x release

johan


On Thu, May 15, 2008 at 10:21 PM, rajdhan <rd...@bluefishgroup.com>
wrote:

>
> We would have definitely tried the upgrade if we could reproduce this issue
> in our Dev/Test environment, unfortunately this happens only in production.
>
> Do you have a reason to believe that this is fixed in the latest version of
> Wicket?
>
>
> Matej Knopp-2 wrote:
> >
> > On Thu, May 15, 2008 at 10:00 PM, rajdhan <rd...@bluefishgroup.com>
> > wrote:
> >>
> >> I dont understand your question?
> >>
> >> Yes I am serious! Why do you ask? Was there something that's supposed to
> >> be
> >> funny, that I am not catching?
> >
> > I don't know, was there? You are using beta release that is over 10
> > months old for production. And you are complaining about a problem
> > without trying the most recent version of Wicket 1.3 first.
> >
> > -Matej
> >
> >>
> >>
> >>
> >> Matej Knopp-2 wrote:
> >>>
> >>>>
> >>>> We are using Wicket 1.3.0 beta 2.
> >>>
> >>> Are you serious?
> >>>
> >>> -Matej
> >>>
> >>>
> >>
> >> --
> >> View this message in context:
> >>
> http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tp11545253p17261201.html
> >> Sent from the Wicket - Dev mailing list archive at Nabble.com.
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tp11545253p17261614.html
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>

Re: Thread synchronization problems in FilePageStore

Posted by rajdhan <rd...@bluefishgroup.com>.
We would have definitely tried the upgrade if we could reproduce this issue
in our Dev/Test environment, unfortunately this happens only in production.

Do you have a reason to believe that this is fixed in the latest version of
Wicket?


Matej Knopp-2 wrote:
> 
> On Thu, May 15, 2008 at 10:00 PM, rajdhan <rd...@bluefishgroup.com>
> wrote:
>>
>> I dont understand your question?
>>
>> Yes I am serious! Why do you ask? Was there something that's supposed to
>> be
>> funny, that I am not catching?
> 
> I don't know, was there? You are using beta release that is over 10
> months old for production. And you are complaining about a problem
> without trying the most recent version of Wicket 1.3 first.
> 
> -Matej
> 
>>
>>
>>
>> Matej Knopp-2 wrote:
>>>
>>>>
>>>> We are using Wicket 1.3.0 beta 2.
>>>
>>> Are you serious?
>>>
>>> -Matej
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tp11545253p17261201.html
>> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tp11545253p17261614.html
Sent from the Wicket - Dev mailing list archive at Nabble.com.


Re: Thread synchronization problems in FilePageStore

Posted by Matej Knopp <ma...@gmail.com>.
On Thu, May 15, 2008 at 10:00 PM, rajdhan <rd...@bluefishgroup.com> wrote:
>
> I dont understand your question?
>
> Yes I am serious! Why do you ask? Was there something that's supposed to be
> funny, that I am not catching?

I don't know, was there? You are using beta release that is over 10
months old for production. And you are complaining about a problem
without trying the most recent version of Wicket 1.3 first.

-Matej

>
>
>
> Matej Knopp-2 wrote:
>>
>>>
>>> We are using Wicket 1.3.0 beta 2.
>>
>> Are you serious?
>>
>> -Matej
>>
>>
>
> --
> View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tp11545253p17261201.html
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>

Re: Thread synchronization problems in FilePageStore

Posted by rajdhan <rd...@bluefishgroup.com>.
I dont understand your question?

Yes I am serious! Why do you ask? Was there something that's supposed to be
funny, that I am not catching?



Matej Knopp-2 wrote:
> 
>>
>> We are using Wicket 1.3.0 beta 2.
> 
> Are you serious?
> 
> -Matej
> 
> 

-- 
View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tp11545253p17261201.html
Sent from the Wicket - Dev mailing list archive at Nabble.com.


Re: Thread synchronization problems in FilePageStore

Posted by Matej Knopp <ma...@gmail.com>.
>
> We are using Wicket 1.3.0 beta 2.

Are you serious?

-Matej

Re: Thread synchronization problems in FilePageStore

Posted by rajdhan <rd...@bluefishgroup.com>.
We are seeing some strange behavior in our Prod environment where we
occasionally see PageExpiredException, even though the user is actively
using the site.

Below is the stack trace
Cannot find the rendered page in session
[pagemap=null,componentPath=0:lpContent:pageHeaderBoltOnListerPanel:placeHolderBoltOns:0:placeHolderBoltOn:editModalLink,versionNumber=0]


Stack trace level: 10
class org.apache.wicket.protocol.http.PageExpiredException
Cannot find the rendered page in session
[pagemap=null,componentPath=0:lpContent:pageHeaderBoltOnListerPanel:placeHolderBoltOns:0:placeHolderBoltOn:editModalLink,versionNumber=0]
org.apache.wicket.request.AbstractRequestCycleProcessor:resolveRenderedPage(451)
org.apache.wicket.protocol.http.WebRequestCycleProcessor:resolve(139)
org.apache.wicket.RequestCycle:step(1090)
org.apache.wicket.RequestCycle:steps(1176)
org.apache.wicket.RequestCycle:request(499)
org.apache.wicket.protocol.http.WicketFilter:doGet(257)
org.apache.wicket.protocol.http.WicketFilter:doFilter(127)

The reason I am posting it here is that, when I was trying to debug into the
issue, there is a call made to the PageStore to get the Page, which involves
the FilePageStore.

I was wondering if the Thread Synchronization problems on the FilePageStore
are causing our Unexpected "SessionExpired" messages.

When I discussed with the users they seem to indicate when that this happens
when they are trying to click on links that open Modal windows Or close
them, or other Ajaxy links, which essentially leaves the user on the same
page. I think I read in the initial post that the Thread Synchronization
problem could be an issue in single page applications.

We are using Wicket 1.3.0 beta 2.

Any thoughts?
-- 
View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tp11545253p17260717.html
Sent from the Wicket - Dev mailing list archive at Nabble.com.


Re: Thread synchronization problems in FilePageStore

Posted by Matej Knopp <ma...@gmail.com>.
Btw, I've tried the test you provided with both FilePageStore and
DiskPageStore. It failed with FilePageStore but passed with
DiskPageStore.

-Matej

On 7/11/07, Jan Vermeulen <ja...@isencia.com> wrote:
>
> As a consequence of 'random' errors in a single-page application using
> Wicket, that always occurred after restoring a page from the FilePageStore,
> I have been debugging the FilePageStore extensively.
>
> I have extended the FilePageStoreTest with a method to simulate the errors
> (see code included at the end of this post).
>
> When does the error occur ?
> I suppose the error is unlikely to occur in an application in which
> navigation is between pages. But in a 'single-page' application, or an
> application that allows for state changes within the same page, it is likely
> to  occur. The error occurs when someone generates a new version of a page,
> requests a previous version of that same page, and repeats that 'forward and
> backward' movement. Since the page id remains the same, and the
> versionNumber gets incremented and decremented, there is a case in which he
> ends up with an incorrect restored page, i.e. a page that corresponds to a
> previous 'forward and backward' movement: it has the same page id and
> versionNumber, but not necessarily the same state.
>
> What's really going wrong here ?
> Finally I found 2 'holes' in the thread synchronization between the
> different threads that operate in the process of page storage. (1) is more
> likely to happen than (2).
>
> (1) The thread that saves the pages in a file, sleeps during 1 second
> (before 2, in which the error occurred  more easily). Then it gets an
> iterator on the 'pagesToBeSaved' map. For being a ConcurrentHashMap, that
> iterator takes a snapshot of the contents of the map, and works on that
> snapshot. For performance reasons, this process is does not lock the
> 'pagesToBeSaved' map. So someone can come in an store a new entry in the map
> while the iterator is busy saving pages. If the entry happens to have the
> same key as one that is already present, it simply replaces that entry.
> Problem is now that the saving thread saves the cached (now obsolete) entry,
> and finally removes the (updated) entry. So the last added entry never gets
> saved, and on the next retrieval, the wrong page will be returned, because
> that's what's in the file.
>
> (2) The thread that serializes the pages correctly locks access to the list
> of pages to be serialized for a given session, but when it has serialized a
> page, it first removes that entry from the 'pagesToBeSerialized' list, and
> only after that (without locking) adds that entry to the 'pagesToBeSaved'.
> That leaves a small hole where the given entry is in neither of the 2 maps.
> With some bad luck (it actually occurred in realtime), the thread that looks
> for a page to restore it, might not find it in either of the two lists, and
> thus decide to restore it from file. And since there was already some file
> with that key, it ends up restoring the wrong page. This is a less likely
> scenario to occur, but it can be easily provoked by adding a sleep() between
> removal from one list and addition to the other to exagerate the hole.
>
> Solutions to the problem
> With the following changes in FilePageStore, the holes can be tapped:
>
> (1) change PageSavingThread to iterate the keys, and lookup actual the value
> for each iteration. That way, we reduce the change of saving an obsolete
> page if the 'pagesToBeSaved' map has been updated after the iterator has
> been created. Advantage: no locking for the whole iteration process is
> needed.
>
>         Iterator it = pagesToBeSaved.keySet().iterator();
>         while (it.hasNext())
>         {
>                 synchronized (pagesToBeSaved) {
>                         SessionPageKey key = (SessionPageKey)pagesToBeSaved.get(it.next());
>                         if (key.data instanceof byte[])
>                         {
>                                 savePage(key, (byte[])key.data);
>                         }
>                         pagesToBeSaved.remove(key);
>                 }
>         }
>
> (2) add a synchronize lock on the 'pagesToBeSaved' map for each single
> iteration, to prevent that another thread can update an entry while it is
> being saved and removed. The updates on 'pagesToBeSaved' must also lock
> access to the map. This prevents the update of an entry in the map while the
> saving thread is saving and removing the entry.
>
>         synchronized (pagesToBeSaved) {
>                 pagesToBeSaved.put(key, key);
>         }
>
> (3) change the order in which the PageSerializingThread removes entries from
> the 'pagesToBeSerialized' map and adds entries to the 'pagesToBeSaved' map:
> if the entry is added to the 'pagesToBeSaved' before being removed from the
> 'pagesToBeSerialized', there can be no case in which it is neither of the
> two lists. The fact that it might appear in both lists at a single moment,
> does not affect the correct restore of pages in any way.
>
>         synchronized (sessionList)
>         {
>                 key.setObject(pageBytes);
>                 synchronized (pagesToBeSaved) {
>                         pagesToBeSaved.put(key, key);
>                 }
>                 sessionList.remove(key);
>                 sessionList.notifyAll();
>         }
>
> Applying these changes, I have been able to eliminate the errors that
> occurred. I tried them out with various 'sleep' times in the unitTest.
>
> Below is the code for the unitTest:
>
> A simple TestPage to be used within the unitTest
>
> package org.apache.wicket.protocol.http;
>
> import org.apache.wicket.markup.html.WebPage;
> import org.apache.wicket.model.Model;
>
> class TestPage extends WebPage {
>         private static final long serialVersionUID = 1L;
>
>         public TestPage() {
>                 setModel(new Model());
>         }
>
>         public void setValue(int value) {
>                 // force a version change
>                 setModelObject(new Integer(value));
>                 // force end of version
>                 onDetach();
>         }
>
>         public int getValue() {
>                 return(((Integer)getModelObject()).intValue());
>         }
>
>         public boolean isBookmarkable() {
>                 return(false);
>         }
>
>         public boolean isVersioned() {
>                 return(true);
>         }
> }
>
> the extended FilePageStoreTest
>
> package org.apache.wicket.protocol.http;
>
> import org.apache.wicket.Application;
> import org.apache.wicket.IPageMap;
> import org.apache.wicket.Session;
> import org.apache.wicket.WicketTestCase;
> import
> org.apache.wicket.protocol.http.SecondLevelCacheSessionStore.IPageStore;
>
> /**
>  * @author jcompagner
>  */
> public class FilePageStoreTest extends WicketTestCase
> {
>         ...
>
>         private void sleep(int time) {
>                 try
>                 {
>                         Thread.sleep(time);
>                 }
>                 catch (InterruptedException e)
>                 {
>                         throw new RuntimeException(e);
>                 }
>         }
>
>         /**
>          * This method simulates a situation in which 2 versions of a page are
> created,
>          * and 2 subsequent 'backs' restore the previous versions of that page,
>          * ending in the restore of the original page.
>          *
>          * When repeating this process a number of times, and if the response times
> are
>          * below the cycle times of the serialization and save threads of the
> FilePageStore,
>          * synchronization errors occur: since the version numbers overlap between
> the loops
>          * (version 1 for loop 2 is stored with the same key as version 1 for loop
> 1),
>          * there is a case where the wrong page is restored: it returns the page
> with
>          * the same id and versionNumber from a previous loop.
>          *
>          * To detect that problem, a value unique to the loop is added to the page,
>          * and when restored, it is checked if the correct value is in the restored
> page.
>          *
>          * Time between requests and number of loops can be variated to simulate
> the problem.
>          */
>         public void testRestorePage() {
>                 // time to wait between subsequent 'requests' (miliseconds)
>                 int time = 5;
>
>                 // number of loops
>                 int loops = 100;
>
>                 // creation of a non-bookmarkable and versioned page
>                 TestPage page = new TestPage();
>
>                 tester.setupRequestAndResponse();
>                 Session session = Session.get();
>
>                 IPageStore pageStore =
> ((SecondLevelCacheSessionStore)Application.get().getSessionStore()).getStore();
>
>                 // pagemap is needed for storage
>                 // (without that, the versionManager never increments the versionNumber
>                 //  because getLastVersion() is based on a cache in the pageMap)
>                 IPageMap pageMap = page.getPageMap();
>
>                 // store the original version 0
>                 page.setValue(0);
>                 pageMap.put(page);
>
>                 sleep(time);
>
>                 for (int i = 1; i <= loops; i++) {
>                         // first requestCycle: change value & store version 1
>                         pageStore.pageAccessed(session.getId(),page);
>                         page.setValue(i);
>                         pageMap.put(page);
>
>                         sleep(time);
>
>                         // second requestCycle: change value & store version 2
>                         pageStore.pageAccessed(session.getId(),page);
>                         page.setValue(0);
>                         pageMap.put(page);
>
>                         sleep(time);
>
>                         // third requestCycle: restore version 1 (expected value: i)
>                         pageStore.pageAccessed(session.getId(),page);
>                         page = (TestPage)pageStore.getPage(session.getId(),null,0,1,0);
>                         assertNotNull("loop "+i+": no page found for version 1",page);
>                         assertEquals("loop "+i+": incorrect page restore for version
> 1:",i,page.getValue());
>                         pageMap.put(page);
>
>                         sleep(time);
>
>                         // fourth requestCycle: restore version 0 (expected value: 0)
>                         pageStore.pageAccessed(session.getId(),page);
>                         page = (TestPage)pageStore.getPage(session.getId(),null,0,0,0);
>                         assertNotNull("loop "+i+": no page found for version 0",page);
>                         assertEquals("loop "+i+": incorrect page restore for version
> 0:",0,page.getValue());
>                         pageMap.put(page);
>
>                         sleep(time);
>                 }
>         }
> --
> View this message in context: http://www.nabble.com/Thread-synchronization-problems-in-FilePageStore-tf4063498.html#a11545253
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>