Posted to dev@cocoon.apache.org by Simone Gianni <s....@thebug.it> on 2006/07/28 15:43:47 UTC

GSoC, repeater pagination in place

Hi all,
I just committed the repeater pagination made by Matthias for his GSoC.
He's here with me in Rome now, so that we can work together on this
stuff. Could you please review it? We tested it on a big application
where repeaters are used to display very big lists (11k items) backed by
hibernate, and it works correctly.

There are still some open problems for which we'd like to receive
feedback from the community; Matthias will explain them.

P.S. The Google Maps stuff is also already working. We're now evaluating
the possibility of adding geocoding facilities to it; you will hear about
it in the near future.

Simone

Re: GSoC, repeater pagination in place

Posted by Simone Gianni <s....@thebug.it>.

Leszek Gawron wrote:

> Simone Gianni wrote:
>
>> Actually we decided to use the List interface for it, and on the test
>> project we have a List implementation that actually works on a Hibernate
>> Criteria in a way that it's able to retrieve blocks of records from
>> hibernate and ask directly to the database the size, only implementing
>> some basic methods of the List interface.
>
> Does it fetch a whole page or a "window" with predefined size?

It's initialized with new HibernateList(DetachedCriteria criteria,
Session session, int fetchSize). When you call get(10), it checks whether
that element has already been fetched; if not, it fetches the elements
from 10 to 10+fetchSize. The size() method is implemented by moving the
cursor to the last row and returning its index. As mentioned, we're
testing it on a list of 11000 very heavy objects, and it's working
perfectly: we see only 10 selects issued to fetch the objects when the
get(x) method is first called.
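
[Editor's sketch] The windowed fetching described above could look roughly
like the following. This is not the actual HibernateList code: BlockSource,
WindowedList and the in-memory rangeSource are hypothetical names, and the
BlockSource callback merely stands in for the Hibernate Criteria and
Session, so the example runs without any Hibernate dependency.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical data source: stands in for the Hibernate Criteria/Session pair. */
interface BlockSource<T> {
    /** Returns up to maxResults rows starting at firstResult. */
    List<T> fetch(int firstResult, int maxResults);

    /** Returns the total number of rows without loading them. */
    int count();
}

/** Read-only List that loads rows in windows of fetchSize on first access. */
class WindowedList<T> extends AbstractList<T> {
    private final BlockSource<T> source;
    private final int fetchSize;
    private final Map<Integer, T> cache = new HashMap<>();
    private int size = -1;   // unknown until size() is first called
    int fetchCount = 0;      // exposed only so the demo can count round trips

    WindowedList(BlockSource<T> source, int fetchSize) {
        this.source = source;
        this.fetchSize = fetchSize;
    }

    @Override
    public T get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (!cache.containsKey(index)) {
            // Not fetched yet: load the window [index, index + fetchSize).
            List<T> block = source.fetch(index, fetchSize);
            fetchCount++;
            for (int i = 0; i < block.size(); i++) {
                cache.put(index + i, block.get(i));
            }
        }
        return cache.get(index);
    }

    @Override
    public int size() {
        if (size < 0) {
            size = source.count();   // one count query, no rows loaded
        }
        return size;
    }
}

class WindowedListDemo {
    /** In-memory source over the integers 0..total-1, for demonstration only. */
    static BlockSource<Integer> rangeSource(final int total) {
        return new BlockSource<Integer>() {
            @Override
            public List<Integer> fetch(int firstResult, int maxResults) {
                List<Integer> rows = new ArrayList<>();
                int end = Math.min(total, firstResult + maxResults);
                for (int i = firstResult; i < end; i++) {
                    rows.add(i);
                }
                return rows;
            }

            @Override
            public int count() {
                return total;
            }
        };
    }

    public static void main(String[] args) {
        WindowedList<Integer> list = new WindowedList<>(rangeSource(11000), 20);
        System.out.println(list.size());      // asks the source for the count only
        System.out.println(list.get(10));     // triggers one fetch of rows 10..29
        System.out.println(list.get(29));     // served from the cached window
        System.out.println(list.fetchCount);  // fetch round trips so far
    }
}
```

In this sketch a window starts at the requested index rather than at an
aligned boundary, which matches the "from 10 to 10+fetchSize" behaviour
described above; a real implementation would probably also bound the cache.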

>
> Would you be willing to make it public?
>
Yep. It will actually be part of the documentation, because we don't
want to introduce a dependency on Hibernate in Cocoon itself. Anyway,
it's under the Apache license, so it will be public.

Simone

-- 
Simone Gianni

Re: GSoC, repeater pagination in place

Posted by Leszek Gawron <lg...@mobilebox.pl>.
Simone Gianni wrote:
> Hi Leszek,
> 
> 
> Leszek Gawron wrote:
> 
>> What about:
>> - goto a specific page action
>> - change page size action
> 
> The goto-a-specific-page action already exists in some preliminary form,
> but we need to solve the page-to-index conversion problem before we
> actually have a working one.
> 
>> I would like to propose something else:
>> public interface ValueListProvider {
>>   public List getRows( SomeContextForFilteringAndSorting context,
>>                        long offset, long rowCount );
>>   public long getTotalRowCount( SomeContextForFilteringAndSorting context );
>> }
> 
> Actually we decided to use the List interface for it, and on the test
> project we have a List implementation that actually works on a Hibernate
> Criteria in a way that it's able to retrieve blocks of records from
> hibernate and ask directly to the database the size, only implementing
> some basic methods of the List interface.
Does it fetch a whole page or a "window" with predefined size?

Would you be willing to make it public?

-- 
Leszek Gawron                                      lgawron@mobilebox.pl
IT Manager                                         MobileBox sp. z o.o.
+48 (61) 855 06 67                              http://www.mobilebox.pl
mobile: +48 (501) 720 812                       fax: +48 (61) 853 29 65

Re: GSoC, repeater pagination in place

Posted by Simone Gianni <s....@thebug.it>.
Hi Leszek,


Leszek Gawron wrote:

> What about:
> - goto a specific page action
> - change page size action

The goto-a-specific-page action already exists in some preliminary form,
but we need to solve the page-to-index conversion problem before we
actually have a working one.

>
> I would like to propose something else:
> public interface ValueListProvider {
>   public List getRows( SomeContextForFilteringAndSorting context,
>                        long offset, long rowCount );
>   public long getTotalRowCount( SomeContextForFilteringAndSorting context );
> }

Actually, we decided to use the List interface for it. In the test
project we have a List implementation that works on a Hibernate Criteria:
it retrieves blocks of records from Hibernate and asks the database
directly for the size, implementing only some basic methods of the List
interface.

>
>> Concerning this we face some "indexing problems" after a page change.
>> Row additions/deletions and sorting change the order and count of
>> rows, therefore we need a technique to obtain the right start-index
>> when we jump to a custom page. It's not guaranteed that our
>> start-index in the collection is 100 if pageSize=10 and
>> requestedPage=10.
>
>
> It may be a little bit too hot here because I cannot think of a reason
> why. Why is that?

We have a collection obtained from the load binding, and then some
deletions and additions made to this collection, which are not reflected
in the collection until the save method of the binding is called. This
means that I have page 1 going from item 0 to item 10 of the collection,
but if element 5 has been deleted (actually, marked for deletion), page 1
goes from 0 to 11, and page 2 starts at 12 instead of 11, and so on.

What makes this harder is that saying "element 5 has been deleted" is
relative to the current sorting of the table: if we change the sort
order, the previous element 5 could now become element 35, and so shift
only the pages from page 3 on.
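
[Editor's sketch] To make the shift concrete, here is a small example that
compares a precise start index (found by scanning and skipping rows marked
for deletion) with the naive page*pageSize value. PageIndex and its methods
are hypothetical names, not part of the committed code. The sketch uses
0-based pages and a page size of 10, so its numbers differ slightly from
the ones in the message above.

```java
import java.util.BitSet;

class PageIndex {
    /**
     * Returns the collection index where the given 0-based page starts,
     * skipping rows marked for deletion. Needs a scan from the beginning.
     * Returns collectionSize if the page lies past the last visible row.
     */
    static int preciseStart(BitSet markedDeleted, int collectionSize,
                            int page, int pageSize) {
        int toSkip = page * pageSize;   // visible rows belonging to earlier pages
        for (int i = 0; i < collectionSize; i++) {
            if (markedDeleted.get(i)) {
                continue;               // pending deletion: not displayed
            }
            if (toSkip == 0) {
                return i;
            }
            toSkip--;
        }
        return collectionSize;
    }

    /** Approximate start that ignores pending deletions: constant time. */
    static int approximateStart(int page, int pageSize) {
        return page * pageSize;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(5);                 // element 5 is marked for deletion
        System.out.println(preciseStart(deleted, 100, 1, 10));
        System.out.println(approximateStart(1, 10));
    }
}
```

The scan has to run over the collection in the current display order, which
is why a change of sort order moves the point from which pages are shifted.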

>>
>> These are our thoughts about it:
>>
>> The previous and next actions are still able to work because we could
>> remember the first and last index of the current page and start
>> relative to them to fetch the next n rows.
>>
>> First and last page could work the same way using 0 and
>> collection.size().
>
>>
>> Do you think it's a real problem to have not exact starting indexes
>> while jumping between pages in big repeaters?
>
>
> I would rather have a little bit inexact results than something that
> would kill my server if 100 users used it at the same time.

Yep, we are asking this because we may not manage to have precise
indexing on a sorted repeater without fetching all the elements in the
collection. We could instead use approximate positioning (like, page 5
starts at 50) and then adjust it as needed if we manage to.

So that's a +1 for favoring performance over precise positioning.

Simone

Re: GSoC, repeater pagination in place

Posted by Leszek Gawron <lg...@mobilebox.pl>.
Matthias Epheser wrote:
> As Simone mentioned, we are currently working together on the repeater 
> pagination. I want to briefly explain how it works and what problems 
> still exist.
This is awesome!

> 
> Pagination can now be achieved by just adding a <fd:pages initial="1" 
> size="20"/> tag to the repeater's definition. I added repeater actions 
> as well for first, previous, next and last page.
What about:
- goto a specific page action
- change page size action

> The actual pageLoad/pageSave takes place in the binding. A storage area 
> is used there to cache updated rows on page change. Once the user 
> submits, the actual saving to the JXPathContext takes place.
> 
> To support really big lists managed by a persistence framework (like in 
> the application Simone mentioned) we implemented the possibility to use 
> "lazy collections". A lazy collection is simply an implementation of the 
> java.util.Collection interface that knows how to handle size() or 
> get(i) calls without fetching all the data from the db.

There are two "levels" of lazy collections:

1. OnlyALittleBitLazy(tm), which fetches the full collection contents on
the first collection call (even collection.size()). This collection is
of no use for paging: it does not scale.

2. TotallyLazy(tm), which fetches all entity ids on the first collection
call and then uses a separate query to fetch EACH entity. This one
seriously faces the famous (n+1) problem (apart from the fact that if
you want to paginate through a table of 100k entries, you won't select
100k entities, but you'll still have a collection of 100k ids).

In order to display 100 entries on a particular page, 100 more queries
are needed. Performance goes down a lot.


I would like to propose something else:
public interface ValueListProvider {
   public List getRows( SomeContextForFilteringAndSorting context,
                        long offset, long rowCount );
   public long getTotalRowCount( SomeContextForFilteringAndSorting context );
}

This way you always have only two queries to run:

select * from Entity e where e.name like '%filter%' limit 100 offset 3
select count( * ) from Entity e where e.name like '%filter%'

and you can always wrap your lazy collection with ValueListProvider.
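
[Editor's sketch] A compilable version of the two-method contract proposed
above, with an in-memory implementation standing in for the two SQL
queries. NameFilter and InMemoryNameProvider are hypothetical, and making
the context type a generic parameter is an assumption: the proposal leaves
SomeContextForFilteringAndSorting unspecified.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter context; the proposal leaves this type open. */
class NameFilter {
    final String substring;

    NameFilter(String substring) {
        this.substring = substring;
    }
}

/** The proposed contract, with context and row types made generic. */
interface ValueListProvider<C, T> {
    List<T> getRows(C context, long offset, long rowCount);

    long getTotalRowCount(C context);
}

/** In-memory stand-in for the limit/offset query and the count query. */
class InMemoryNameProvider implements ValueListProvider<NameFilter, String> {
    private final List<String> names;

    InMemoryNameProvider(List<String> names) {
        this.names = names;
    }

    /** Plays the role of the "where e.name like '%filter%'" clause. */
    private List<String> filtered(NameFilter filter) {
        List<String> out = new ArrayList<>();
        for (String name : names) {
            if (name.contains(filter.substring)) {
                out.add(name);
            }
        }
        return out;
    }

    @Override
    public List<String> getRows(NameFilter context, long offset, long rowCount) {
        List<String> all = filtered(context);
        // Clamp both bounds so an offset past the end yields an empty page.
        int from = (int) Math.min(offset, all.size());
        int to = (int) Math.min(offset + rowCount, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    @Override
    public long getTotalRowCount(NameFilter context) {
        return filtered(context).size();
    }
}
```

A database-backed implementation would replace filtered() with the two
queries shown above, so the repeater never needs more than one page of
rows plus a count.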


> Another feature we want to implement is sorting, more precisely 
> displaying sorted rows. It's not our intention to actually store the 
> data to the object in a different order but to make it possible for the 
> user to click on the column-header and get the rows displayed in 
> ascending/descending order. We think we need some data-providing class 
> that acts as a layer between the binding storage and the repeater rows, 
> that provides sort(columnName) and getRows(from,to) or similar. We now 
> have to evaluate how this could be done in a decent way and keep the 
> door open for our lazy list here.
> 
> Concerning this we face some "indexing problems" after a page change. 
> Row additions/deletions and sorting change the order and count of rows, 
> therefore we need a technique to obtain the right start-index when we 
> jump to a custom page. It's not guaranteed that our start-index in the 
> collection is 100 if pageSize=10 and requestedPage=10.

It may be a little bit too hot here because I cannot think of a reason 
why. Why is that?

> 
> These are our thoughts about it:
> 
> The previous and next actions are still able to work because we could 
> remember the first and last index of the current page and start relative 
> to them to fetch the next n rows.
> 
> First and last page could work the same way using 0 and collection.size().

The first page of course starts at 0.
The last page could start at:
( java.lang.Math.ceil( (double) collectionSize / pageSize ) - 1 ) * pageSize
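
[Editor's sketch] The same computation with integer arithmetic only. With
int operands, collectionSize / pageSize truncates before Math.ceil ever
sees it, so the sketch uses integer ceiling division instead. LastPage and
lastPageStart are hypothetical names.

```java
class LastPage {
    /** Start index of the last page, using integer ceiling division. */
    static int lastPageStart(int collectionSize, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (collectionSize <= 0) {
            return 0;   // empty collection: the only page starts at 0
        }
        // ceil(collectionSize / pageSize) without floating point
        int pageCount = (collectionSize + pageSize - 1) / pageSize;
        return (pageCount - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(lastPageStart(95, 10));    // 90
        System.out.println(lastPageStart(100, 10));   // 90
        System.out.println(lastPageStart(101, 10));   // 100
    }
}
```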

> 
> So the problem is located in the goto-widget. We are thinking of a 
> couple of possible solutions. Precise positioning would only be possible 
> if we use iteration and check all rows to compute the correct starting 
> point. Another solution would be to use "approximate" starting 
> indexes. That means that, facing large collections, we always take 
> pageNumber*pageSize as the starting index after the custom-page action. 
> This approach solves the (maybe resource intensive) iteration problem 
> but could provide results that are not exact. Another option would be to 
> move this problem to the not yet existing data-providing class I 
> mentioned in the sorting part.

> 
> Do you think it's a real problem to have not exact starting indexes 
> while jumping between pages in big repeaters?

I would rather have a little bit inexact results than something that 
would kill my server if 100 users used it at the same time.


Re: GSoC, repeater pagination in place

Posted by Matthias Epheser <ma...@gmx.at>.
Simone Gianni schrieb:
> Hi all,
> I just committed the repeater pagination made by Matthias for his GSoC.
> He's here with me in Rome now, so that we can work together on this
> stuff. Could you please review it? We tested it on a big application
> where repeaters are used to display very big lists (11k items) backed by
> hibernate, and it works correctly.
> 
> There are still some open problems for which we'd like to receive
> feedback from the community, Matthias will explain them.
> 
> P.S. The Google Maps stuff is also already working. We're now evaluating
> the possibility of adding geocoding facilities to it; you will hear about
> it in the near future.
> 
> Simone
> 
> 

Hi,

As Simone mentioned, we are currently working together on the repeater
pagination. I want to briefly explain how it works and what problems
still exist.

Pagination can now be achieved by just adding a <fd:pages initial="1" 
size="20"/> tag to the repeater's definition. I added repeater actions 
as well for first, previous, next and last page.

The actual pageLoad/pageSave takes place in the binding. A storage area 
is used there to cache updated rows on page change. Once the user 
submits, the actual saving to the JXPathContext takes place.

To support really big lists managed by a persistence framework (like in
the application Simone mentioned) we implemented the possibility to use
"lazy collections". A lazy collection is simply an implementation of the
java.util.Collection interface that knows how to handle size() or
get(i) calls without fetching all the data from the db.

Another feature we want to implement is sorting, or more precisely,
displaying sorted rows. It's not our intention to actually store the
data in the object in a different order, but to make it possible for the
user to click on the column header and get the rows displayed in
ascending/descending order. We think we need some data-providing class
that acts as a layer between the binding storage and the repeater rows
and provides sort(columnName) and getRows(from,to) or similar. We now
have to evaluate how this could be done in a decent way while keeping the
door open for our lazy list.
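
[Editor's sketch] One possible shape for such a layer: it sorts a list of
positions and leaves the backing data in its stored order. SortedView is a
hypothetical name, and taking a Comparator instead of a column name is a
simplification made here so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical display layer: sorts positions, not the backing data. */
class SortedView<T> {
    private final List<T> backing;
    private final List<Integer> order = new ArrayList<>();

    SortedView(List<T> backing) {
        this.backing = backing;
        for (int i = 0; i < backing.size(); i++) {
            order.add(i);   // initial display order equals stored order
        }
    }

    /** Reorders the view only; the backing list keeps its original order. */
    void sort(Comparator<? super T> comparator) {
        order.sort((a, b) -> comparator.compare(backing.get(a), backing.get(b)));
    }

    /** Returns the rows at view positions [from, to), clamped to the view size. */
    List<T> getRows(int from, int to) {
        List<T> rows = new ArrayList<>();
        int end = Math.min(to, order.size());
        for (int i = Math.max(0, from); i < end; i++) {
            rows.add(backing.get(order.get(i)));
        }
        return rows;
    }
}
```

Note that sorting this way reads every row, so it would defeat a lazy
list. For large lazy collections the ordering would presumably have to be
pushed down into the query instead.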

Concerning this we face some "indexing problems" after a page change. 
Row additions/deletions and sorting change the order and count of rows, 
therefore we need a technique to obtain the right start-index when we 
jump to a custom page. It's not guaranteed that our start-index in the 
collection is 100 if pageSize=10 and requestedPage=10.

These are our thoughts about it:

The previous and next actions are still able to work because we could
remember the first and last index of the current page and start relative
to them to fetch the next n rows.

First and last page could work the same way using 0 and collection.size().
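
[Editor's sketch] The relative navigation described above, written out
under the assumption that rows marked for deletion are tracked in a
BitSet. RelativePager and its methods are hypothetical names, not the
committed repeater actions.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical pager that remembers the collection indexes of the current page. */
class RelativePager {
    private final BitSet markedDeleted;
    private final int collectionSize;
    private final int pageSize;
    private List<Integer> current = new ArrayList<>();

    RelativePager(BitSet markedDeleted, int collectionSize, int pageSize) {
        this.markedDeleted = markedDeleted;
        this.collectionSize = collectionSize;
        this.pageSize = pageSize;
        first();
    }

    /** Collects up to pageSize visible indexes walking forward from start. */
    private List<Integer> forward(int start) {
        List<Integer> page = new ArrayList<>();
        for (int i = start; i < collectionSize && page.size() < pageSize; i++) {
            if (!markedDeleted.get(i)) {
                page.add(i);
            }
        }
        return page;
    }

    /** Collects up to pageSize visible indexes walking backward from start. */
    private List<Integer> backward(int start) {
        List<Integer> page = new ArrayList<>();
        for (int i = start; i >= 0 && page.size() < pageSize; i--) {
            if (!markedDeleted.get(i)) {
                page.add(0, i);   // prepend to keep ascending order
            }
        }
        return page;
    }

    List<Integer> first() {
        current = forward(0);
        return current;
    }

    /** Walks back from collectionSize, so the result may not align with page boundaries. */
    List<Integer> last() {
        current = backward(collectionSize - 1);
        return current;
    }

    List<Integer> next() {
        if (!current.isEmpty()) {
            List<Integer> page = forward(current.get(current.size() - 1) + 1);
            if (!page.isEmpty()) {
                current = page;   // otherwise stay on the last page
            }
        }
        return current;
    }

    List<Integer> previous() {
        if (!current.isEmpty()) {
            List<Integer> page = backward(current.get(0) - 1);
            if (!page.isEmpty()) {
                current = page;   // otherwise stay on the first page
            }
        }
        return current;
    }
}
```

Each step only scans about one page of rows, whatever the collection size,
which is why these four actions stay cheap while a jump to an arbitrary
page does not.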

So the problem is located in the goto-widget. We are thinking of a
couple of possible solutions. Precise positioning would only be possible
if we use iteration and check all rows to compute the correct starting
point. Another solution would be to use "approximate" starting
indexes. That means that, facing large collections, we always take
pageNumber*pageSize as the starting index after the custom-page action.
This approach solves the (maybe resource-intensive) iteration problem
but could provide results that are not exact. Another option would be to
move this problem to the not-yet-existing data-providing class I
mentioned in the sorting part.

Do you think it's a real problem to have inexact starting indexes
while jumping between pages in big repeaters?

Regards,
Matthias