Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2008/12/13 14:25:07 UTC

Paged Search Control questions

Hi guys,

as I'm busy implementing this control, I have some questions about the RFC 
and some choices to be made.

The RFC (http://www.rfc-archive.org/getrfc.php?rfc=2696) is not really 
clear on some aspects concerning the size limit. Let me say I'm a bit 
baffled by the choice to define a paged size value in the control, when 
the sizeLimit would have been plain ok.

So here are my questions :

1) considering that we have a server sizeLimit, a request sizeLimit and 
a page size limit, I'm wondering if we can simply ignore the request 
size limit. The page size limit can change, even if the paged result is 
being processed, but the RFC says "If the page size is greater than or 
equal to the sizeLimit value, the server should ignore the control as 
the request can be satisfied in a single page". Should I consider that 
the 'sizeLimit' is the request sizeLimit ? My personal bet is : yes.

2) so second question : what if in one of the subsequent requests, the 
page size limit is changed and is superior to the sizeLimit ? This 
request sizeLimit cannot have changed, otherwise the search request 
would have been considered as a new search ( "...a searchRequest with 
all values identical to the initial request with the exception of the 
messageID, the cookie, and optionally a modified pageSize..."). My 
personal guess is again to consider that we should deliver as many 
entries as we can, up to the sizeLimit, and generate an LDAP error #4 : 
sizeLimitExceeded.

3) regarding the search request immutability : it's pretty hard to check 
that the filter hasn't changed, as it may be a complex one, with a 
different structure and a different order. I think that this 
constraint is fully absurd, as the client will obviously create one 
request, and send a null cookie every time it will send a new paged 
search, so I don't see the validity of such a check. Nevertheless, 
should we try to implement such a check ? My personal guess, again, is 
that it's useless.

Wdyt ?

I'm also interested to have some feedback about how this control is 
handled by the other ldap servers, considering the many factors 
influencing this control :
- internal server size limit
- how many of such paged search can be handled for a single client
- what happens when we send a bad cookie to the server
- what happens when we play with the sizeLimit parameter


Last, not least, as we are using cursors to get the entries from the 
backend, we are able to move forward or backward. It would be 
interesting to extend this control to allow a backward pagedSearch (for 
instance, providing a negative paged size). Would it be interesting ?

I'm waiting for your opinions and enlightenment.

Many thanks!

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Paged Search Control questions

Posted by Emmanuel Lecharny <el...@gmail.com>.
Stefan Seelmann wrote:
> Emmanuel Lecharny wrote:
>   
>> Hi guys,
>>
>> as I'm busy implementing this control, 
>>     
>
> Cool
>
>   
>> 1) considering that we have a server sizeLimit, a request sizeLimit and
>> a page size limit, I'm wondering if we can simply ignore the request
>> size limit. The page size limit can change, even if the paged result is
>> being processed, but the RFC says "If the page size is greater than or
>> equal to the sizeLimit value, the server should ignore the control as
>> the request can be satisfied in a single page". Should I consider that
>> the 'sizeLimit' is the request sizeLimit ? My personal bet is : yes.
>>     
>
> Yes, I would say the 'sizeLimit' is the request sizeLimit.
>
> I think we should not ignore the request sizeLimit, I would consider it
> as a client-side limit for the complete search over all pages.
>
> Say the request sizeLimit is 10 and the page size is 8 for both requests
> then the first result contains 8 entries and the second contains 2
> entries plus an LDAP error #4.
>   
Makes perfect sense. So we will have to remember the number of already 
returned entries, and compare it
to the sizeLimit. Fine.
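
A minimal sketch of that bookkeeping, in Java since ApacheDS is written in
Java. The names PagedSearchState and nextPage are hypothetical and are not
taken from the ApacheDS code base. The sketch assumes the server knows how
many matching entries remain when it builds a page, and that a sizeLimit of
0 means "no limit". Result code 4 is sizeLimitExceeded.

```java
/** Hypothetical per-search state kept by the server between two pages. */
final class PagedSearchState {
    static final int SUCCESS = 0;
    static final int SIZE_LIMIT_EXCEEDED = 4;

    private final int sizeLimit;          // request sizeLimit, 0 = no limit
    private int returned;                 // entries already sent, all pages
    private int lastResultCode = SUCCESS;

    PagedSearchState(int sizeLimit) {
        this.sizeLimit = sizeLimit;
    }

    /**
     * Decides how many entries the next page may carry.
     *
     * @param pageSize  page size taken from the control
     * @param available matching entries not yet returned
     * @return number of entries to put in this page
     */
    int nextPage(int pageSize, int available) {
        int budget = sizeLimit == 0 ? Integer.MAX_VALUE : sizeLimit - returned;
        int count = Math.min(pageSize, Math.min(available, budget));
        returned += count;
        // The limit is used up while more matching entries still exist.
        boolean truncated = sizeLimit != 0
                && returned == sizeLimit
                && available > count;
        lastResultCode = truncated ? SIZE_LIMIT_EXCEEDED : SUCCESS;
        return count;
    }

    int lastResultCode() {
        return lastResultCode;
    }
}
```

With a request sizeLimit of 10 and a page size of 8, the first call gives 8
entries with success and the second gives 2 entries with code 4, provided
more than 10 entries match the search.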
>> 3) regarding the search request immutability : it's pretty hard to check
>> that the filter hasn't changed, as it may be a complex one, with a
>> different structure and a different order. I think that this
>> constraint is fully absurd, as the client will obviously create one
>> request, and send a null cookie every time it will send a new paged
>> search, so I don't see the validity of such a check. Nevertheless,
>> should we try to implement such a check ? My personal guess, again, is
>> that it's useless.
>>     
>
> To check if the filter changed logically is really complex. I think you
> could just check if the string representation or the bytes that are
> received were changed.
>   
This is what I currently do. There is nothing more I can do...
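
One way to sketch such a byte-level check, assuming the server keeps a copy
of the fields that must stay identical between pages. RequestSnapshot and
its fields are hypothetical names, not the ApacheDS implementation. Two
filters that are logically equivalent but written in a different order are
treated as different searches here, which is the accepted trade-off.

```java
/** Hypothetical copy of the request fields that must not change. */
final class RequestSnapshot {
    private final String baseDn;
    private final int scope;
    private final int sizeLimit;
    private final byte[] encodedFilter;   // filter bytes exactly as received

    RequestSnapshot(String baseDn, int scope, int sizeLimit, byte[] encodedFilter) {
        this.baseDn = baseDn;
        this.scope = scope;
        this.sizeLimit = sizeLimit;
        this.encodedFilter = encodedFilter.clone();
    }

    /** True when the follow-up request carries the very same search. */
    boolean sameSearchAs(RequestSnapshot other) {
        return baseDn.equals(other.baseDn)
                && scope == other.scope
                && sizeLimit == other.sizeLimit
                && java.util.Arrays.equals(encodedFilter, other.encodedFilter);
    }
}
```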
>   
>> I'm also interested to have some feedback about how this control is
>> handled by the other ldap servers, considering the many factors
>> influencing this control :
>> - internal server size limit
>>     
>
> Here implementations differ.
> - OpenLDAP just stops if the server side limit is exceeded.
>   
I thought you can define soft and hard limits... I have to check in my 
favorite OpenLDAP book 'Mastering OpenLDAP'.
> - MS AD has a default server side limit of 1000, but when this
> control is used you could get more
>   
Which violates the spirit of this limit... Thanks, AD !
>   
>> - how many of such paged search can be handled for a single client
>>     
> Infinite?
>   
Costly ... Currently, we don't have any limit, but at some point, we 
will need to discard some old searches, as each of them eats a big chunk 
of memory. I would say that a circular list of 10 paged search requests 
should be a valid default. It may be configurable, too.

The main issue I have atm is that I have to implement a mechanism to 
discard timed-out requests.
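
A rough sketch of such a bounded store with a timeout. It uses an
insertion-ordered LinkedHashMap instead of a literal circular list, and
every name in it (PagedSearchStore, Slot, the String cookie key) is an
assumption made for illustration. The caller passes the current time
explicitly, and the class is not thread-safe.

```java
/** Hypothetical bounded store of paged searches for one client session. */
final class PagedSearchStore<S> {
    private static final class Slot<S> {
        final S state;
        long lastUsedMillis;

        Slot(S state, long now) {
            this.state = state;
            this.lastUsedMillis = now;
        }
    }

    private final long timeoutMillis;
    private final java.util.Map<String, Slot<S>> slots;

    PagedSearchStore(final int maxSearches, long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
        // Insertion-ordered map that drops the oldest search once full.
        this.slots = new java.util.LinkedHashMap<String, Slot<S>>() {
            @Override
            protected boolean removeEldestEntry(java.util.Map.Entry<String, Slot<S>> eldest) {
                return size() > maxSearches;
            }
        };
    }

    void put(String cookie, S state, long now) {
        slots.put(cookie, new Slot<>(state, now));
    }

    /** Returns the state, or null if unknown, evicted or timed out. */
    S get(String cookie, long now) {
        Slot<S> slot = slots.get(cookie);
        if (slot == null) {
            return null;
        }
        if (now - slot.lastUsedMillis > timeoutMillis) {
            slots.remove(cookie);         // lazily discard the stale search
            return null;
        }
        slot.lastUsedMillis = now;
        return slot.state;
    }

    int size() {
        return slots.size();
    }
}
```

Stale searches are only discarded lazily here, when their cookie is
presented again. A periodic sweep would be needed to free memory for
clients that never come back.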
>   
>> - what happens when we send a bad cookie to the server
>>     
> Either start a new search or return an error
>   
I think that sending an error is better.
>   
>> - what happens when we play with the sizeLimit parameter
>>     
> You mean the request sizeLimit? If it is changed it should be considered
> as a new search.
>   
You're right !
>   
>> Last, not least, as we are using cursors to get the entries from the
>> backend, we are able to move forward or backward. It would be
>> interesting to extend this control to allow a backward pagedSearch (for
>> instance, providing a negative paged size). Would it be interesting ?
>>     
>
> There is another control for this: VLV, see [1]. But it also needs
> server-side sorting.
>   
True ! I forgot about this guy...

Thanks Stefan, very interesting responses ! I was running in circles, I 
start to see the light now :)

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Paged Search Control questions

Posted by Emmanuel Lecharny <el...@gmail.com>.
Stefan Seelmann wrote:
> Emmanuel Lecharny wrote:
>   
>> Another one :
>>
>> suppose we have a normal user doing a search request with a sizeLimit of
>> 10, with the server limit set to 5, and the potential result would be 7
>> entries (so the result will be truncated to 5 entries due to the server
>> limit) :
>> - should we generate a SizeLimitExceededException ?
>>     
>
> Do we generate such a SizeLimitExceededException when doing a normal
> search request without the paged search control? I guess yes. So I think
> we should also return an LDAP code 4 here.
>   
The RFC says : "sizeLimitExceeded (4) : Indicates that the size limit 
specified by the client was exceeded before the operation could be 
completed."

So if the client does not specify a sizeLimit, but the server has one, I 
don't know if we should generate a code 4 ...

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Paged Search Control questions

Posted by Howard Chu <hy...@symas.com>.
Emmanuel Lecharny wrote:
> Howard Chu wrote:
> This was a part of the spec I misunderstood, clearly. It makes a hell
> of a lot of sense to consider that the sizeLimit is global, and should be
> considered across the multiple paged searches. The funny part is that I
> had injected such a counter in the internal structure before asking the
> question, I don't know what for, and now, I see the reason. The
> reptilian part of my brain knew it ;)

Gotta love that, coding by instinct... ;)

>> MS AD is broken in this respect (which is particularly pathetic given
>> that some MS folks co-authored the spec, but so it goes).
>> I.e., when no control is present, the search result set will be the
>> smaller of the client's requested size limit, and any administrative
>> limits configured on the server. With the control present, the total
>> number of returned entries allowed is still the same, just that they
>> may be received by the client in smaller groups.

> I don't see where the problem is here ...

My paragraph above may have been a bit muddy. In general there's no problem here.

> so in the second case, the client will receive entries in PL sized
> groups, until we reach the server limit or the request limit ?

Right.

The problem I was alluding to with MS AD is that when a Paging control is 
used, the server's sizeLimit is effectively ignored. MS AD has a default 
sizeLimit of 1000 entries, but using Paging you can retrieve as many 
1000-entry pages as you want, thus bypassing that limit. That's of course 
completely wrong behavior, but that's MS for ya...

> That's the funniest part of the job : dealing with others' errors when you
> already have enough of your own on your plate !

Yeah, keeps things from getting too easy I suppose. ;)
>
> Thanks Howard !

Any time. (Except when I'm in the wilderness without a good network connection :P)
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: Paged Search Control questions

Posted by Emmanuel Lecharny <el...@gmail.com>.
Howard Chu wrote:
> Emmanuel Lecharny wrote:
>> Stefan Seelmann wrote:
>>> Emmanuel Lecharny wrote:
>>>
>>>> Another one :
>>>>
>>>> suppose we have a normal user doing a search request with a 
>>>> sizeLimit of
>>>> 10, with the server limit set to 5, and the potential result would 
>>>> be 7
>>>> entries (so the result will be truncated to 5 entries due to the 
>>>> server
>>>> limit) :
>>>> - should we generate a SizeLimitExceededException ?
>>>>
>>> Do we generate such a SizeLimitExceededException when doing a normal
>>> search request without the paged search control? I guess yes. So I 
>>> think
>>> we should also return an LDAP code 4 here.
>>>
>> Seems like OpenLDAP behaves this way. So ADS will generate a code 4 if
>> the server size limit is exceeded.
>>
>> Thanks Stefan !
>
> Sorry for jumping in here late, wanted to reply earlier but on dialup 
> right now so it's not always convenient... Still, it looks like Stefan 
> covered all the bases already.
Yep, and you confirmed that he was plain right :) Sometimes, when you try 
to implement something, you'd better ask those who 'know' :)
>
> To summarize: using the Paging control should only be considered a 
> form of XON/XOFF flow control for a single Search request. It cannot 
> (MUST not) change the server's behavior wrt the overall size limits in 
> effect. 
This was a part of the spec I misunderstood, clearly. It makes a hell 
of a lot of sense to consider that the sizeLimit is global, and should be 
considered across the multiple paged searches. The funny part is that I 
had injected such a counter in the internal structure before asking the 
question, I don't know what for, and now, I see the reason. The 
reptilian part of my brain knew it ;)
> MS AD is broken in this respect (which is particularly pathetic given 
> that some MS folks co-authored the spec, but so it goes).
> I.e., when no control is present, the search result set will be the 
> smaller of the client's requested size limit, and any administrative 
> limits configured on the server. With the control present, the total 
> number of returned entries allowed is still the same, just that they 
> may be received by the client in smaller groups.
I don't see where the problem is here ... Suppose you have a server 
size limit (SL), a request size limit (RL) and a paged size limit (PL), 
the expected behavior when not using the control should be :

- return min( SL, RL) entries

and if the control is present :

if ( PL < min( SL, RL ) )
  count = 0
  while ( count < min( SL, RL ) and entries remain ) do
    return min( PL, min( SL, RL ) - count ) entries or less
    count += number of returned entries
  done
else
  return min( SL, RL ) entries

so in the second case, the client will receive entries in PL sized 
groups, until we reach the server limit or the request limit ?
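
The pseudocode above can be written in Java roughly as follows. This is
only a sketch: PagingPlan and pageSizes are hypothetical names, all three
limits are assumed to be strictly positive (the "0 means no limit" case is
left out), and the number of matching entries is passed in directly
instead of being read from a cursor.

```java
/** Hypothetical helper listing the page sizes sent for one search. */
final class PagingPlan {
    /**
     * @param sl       server size limit (SL), assumed > 0
     * @param rl       request size limit (RL), assumed > 0
     * @param pl       page size from the control (PL), assumed > 0
     * @param matching number of entries matching the search
     * @return the number of entries carried by each successive response
     */
    static java.util.List<Integer> pageSizes(int sl, int rl, int pl, int matching) {
        int limit = Math.min(sl, rl);
        int total = Math.min(matching, limit);
        java.util.List<Integer> pages = new java.util.ArrayList<>();
        if (pl >= limit) {
            // The control is ignored: a single response carries everything.
            pages.add(total);
            return pages;
        }
        int count = 0;
        while (count < total) {
            int page = Math.min(pl, total - count);   // "PL entries or less"
            pages.add(page);
            count += page;
        }
        return pages;
    }
}
```

For example, with SL = 100, RL = 10, PL = 8 and 50 matching entries, the
client receives a page of 8 entries and then a page of 2 entries.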
>
> Also beware of another issue: the spec says the page size is an 
> Integer but MS AD implements it as an unsigned, and there are MS 
> clients out there that expect to be able to set the max size (god 
> knows why, since that effectively disables paging) 
The MS client, you mean the utterly crappy one named LDP ? Ok, we have to take 
care of that, and consider that a negative number means something like 
the max number of entries then. Easy. Thanks for the warning :)
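
A small sketch of that workaround, assuming the page size arrives as a
signed Java int after decoding, so that a client which set every bit of an
unsigned 32-bit value shows up as a negative number. PageSizes, normalize
and asUnsigned are hypothetical names.

```java
/** Hypothetical helper that tolerates "unsigned" page sizes. */
final class PageSizes {
    /**
     * Treats a negative decoded page size as "as many entries as possible"
     * instead of rejecting the request with a protocol error.
     */
    static int normalize(int decodedPageSize) {
        return decodedPageSize < 0 ? Integer.MAX_VALUE : decodedPageSize;
    }

    /** The value the client most likely meant, read as unsigned 32 bits. */
    static long asUnsigned(int decodedPageSize) {
        return Integer.toUnsignedLong(decodedPageSize);
    }
}
```

Since a page can never hold more entries than the overall size limits
allow, capping at Integer.MAX_VALUE simply means the limits decide how
many entries are returned.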
> and will implode when they receive a ProtocolError in response to such 
> an erroneous request. (I've forgotten the ID# of the bug report in our 
> tracker...) All in all it's a crummy spec and you can pretty much bet 
> that when you run across a client that depends on it, the client is 
> broken in a lot of other ways.
That's the funniest part of the job : dealing with others' errors when you 
already have enough of your own on your plate !

Thanks Howard !


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Paged Search Control questions

Posted by Howard Chu <hy...@symas.com>.
Emmanuel Lecharny wrote:
> Stefan Seelmann wrote:
>> Emmanuel Lecharny wrote:
>>
>>> Another one :
>>>
>>> suppose we have a normal user doing a search request with a sizeLimit of
>>> 10, with the server limit set to 5, and the potential result would be 7
>>> entries (so the result will be truncated to 5 entries due to the server
>>> limit) :
>>> - should we generate a SizeLimitExceededException ?
>>>
>> Do we generate such a SizeLimitExceededException when doing a normal
>> search request without the paged search control? I guess yes. So I think
>> we should also return an LDAP code 4 here.
>>
> Seems like OpenLDAP behaves this way. So ADS will generate a code 4 if
> the server size limit is exceeded.
>
> Thanks Stefan !

Sorry for jumping in here late, wanted to reply earlier but on dialup right 
now so it's not always convenient... Still, it looks like Stefan covered all 
the bases already.

To summarize: using the Paging control should only be considered a form of 
XON/XOFF flow control for a single Search request. It cannot (MUST not) change 
the server's behavior wrt the overall size limits in effect. MS AD is broken 
in this respect (which is particularly pathetic given that some MS folks 
co-authored the spec, but so it goes). I.e., when no control is present, the 
search result set will be the smaller of the client's requested size limit, 
and any administrative limits configured on the server. With the control 
present, the total number of returned entries allowed is still the same, just 
that they may be received by the client in smaller groups.

Also beware of another issue: the spec says the page size is an Integer but MS 
AD implements it as an unsigned, and there are MS clients out there that 
expect to be able to set the max size (god knows why, since that effectively 
disables paging) and will implode when they receive a ProtocolError in 
response to such an erroneous request. (I've forgotten the ID# of the bug 
report in our tracker...) All in all it's a crummy spec and you can pretty 
much bet that when you run across a client that depends on it, the client is 
broken in a lot of other ways.
-- 
    -- Howard Chu
    CTO, Symas Corp.           http://www.symas.com
    Director, Highland Sun     http://highlandsun.com/hyc/
    Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: Paged Search Control questions

Posted by Emmanuel Lecharny <el...@gmail.com>.
Stefan Seelmann wrote:
> Emmanuel Lecharny wrote:
>   
>> Another one :
>>
>> suppose we have a normal user doing a search request with a sizeLimit of
>> 10, with the server limit set to 5, and the potential result would be 7
>> entries (so the result will be truncated to 5 entries due to the server
>> limit) :
>> - should we generate a SizeLimitExceededException ?
>>     
>
> Do we generate such a SizeLimitExceededException when doing a normal
> search request without the paged search control? I guess yes. So I think
> we should also return an LDAP code 4 here.
>   
Seems like OpenLDAP behaves this way. So ADS will generate a code 4 if 
the server size limit is exceeded.
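
The case discussed above (request sizeLimit 10, server limit 5, 7 matching
entries) can be sketched as follows. LimitOutcome and evaluate are
hypothetical names, and both limits are assumed to be strictly positive.

```java
/** Hypothetical result of applying the size limits to one search. */
final class LimitOutcome {
    final int entriesReturned;
    final int resultCode;     // 0 = success, 4 = sizeLimitExceeded

    private LimitOutcome(int entriesReturned, int resultCode) {
        this.entriesReturned = entriesReturned;
        this.resultCode = resultCode;
    }

    static LimitOutcome evaluate(int requestLimit, int serverLimit, int matching) {
        int limit = Math.min(requestLimit, serverLimit);
        if (matching > limit) {
            // Truncated by whichever limit is smaller, client or server.
            return new LimitOutcome(limit, 4);
        }
        return new LimitOutcome(matching, 0);
    }
}
```

Here evaluate(10, 5, 7) returns 5 entries with code 4, even though the
client's own sizeLimit of 10 was never reached.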

Thanks Stefan !


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Paged Search Control questions

Posted by Stefan Seelmann <se...@apache.org>.
Emmanuel Lecharny wrote:
> Another one :
> 
> suppose we have a normal user doing a search request with a sizeLimit of
> 10, with the server limit set to 5, and the potential result would be 7
> entries (so the result will be truncated to 5 entries due to the server
> limit) :
> - should we generate a SizeLimitExceededException ?

Do we generate such a SizeLimitExceededException when doing a normal
search request without the paged search control? I guess yes. So I think
we should also return an LDAP code 4 here.

Regards,
Stefan


Re: Paged Search Control questions

Posted by Emmanuel Lecharny <el...@gmail.com>.
Stefan Seelmann wrote:
> Emmanuel Lecharny wrote:
>   
>> One more question :
>>
>> currently, if the request is sent by the administrator, we don't respect
>> the server limit. Is this a good decision ? (of course, the request
>> sizeLimit is still valid).
>>
>>     
>
> I think it's ok. The admin has special privileges, like root. This way
> the admin is able to get all entries. If I remember correctly also ACIs
> are ignored for the admin.
>   
Ok, thanks.

Another one :

suppose we have a normal user doing a search request with a sizeLimit of 
10, with the server limit set to 5, and the potential result would be 7 
entries (so the result will be truncated to 5 entries due to the server 
limit) :
- should we generate a SizeLimitExceededException ?
> Regards,
> Stefan
>
>
>
>   


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Paged Search Control questions

Posted by Stefan Seelmann <se...@apache.org>.
Emmanuel Lecharny wrote:
> One more question :
> 
> currently, if the request is sent by the administrator, we don't respect
> the server limit. Is this a good decision ? (of course, the request
> sizeLimit is still valid).
> 

I think it's ok. The admin has special privileges, like root. This way
the admin is able to get all entries. If I remember correctly also ACIs
are ignored for the admin.
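
A possible way to express this rule, as a sketch only. EffectiveLimit and
compute are hypothetical names, and the value 0 is used for "no limit",
following the LDAP convention for the request sizeLimit.

```java
/** Hypothetical computation of the size limit that applies to a search. */
final class EffectiveLimit {
    static final int NO_LIMIT = 0;

    static int compute(int requestLimit, int serverLimit, boolean isAdmin) {
        // The administrator is not bound by the server limit,
        // but the request sizeLimit is still honoured.
        int server = isAdmin ? NO_LIMIT : serverLimit;
        if (requestLimit == NO_LIMIT) {
            return server;
        }
        if (server == NO_LIMIT) {
            return requestLimit;
        }
        return Math.min(requestLimit, server);
    }
}
```

With a request sizeLimit of 10 and a server limit of 5, a normal user is
capped at 5 entries while the administrator is capped at 10.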

Regards,
Stefan



Re: Paged Search Control questions

Posted by Emmanuel Lecharny <el...@gmail.com>.
One more question :

currently, if the request is sent by the administrator, we don't respect 
the server limit. Is this a good decision ? (of course, the request 
sizeLimit is still valid).

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Paged Search Control questions

Posted by Stefan Seelmann <se...@apache.org>.
Emmanuel Lecharny wrote:
> Hi guys,
> 
> as I'm busy implementing this control, 

Cool

> 1) considering that we have a server sizeLimit, a request sizeLimit and
> a page size limit, I'm wondering if we can simply ignore the request
> size limit. The page size limit can change, even if the paged result is
> being processed, but the RFC says "If the page size is greater than or
> equal to the sizeLimit value, the server should ignore the control as
> the request can be satisfied in a single page". Should I consider that
> the 'sizeLimit' is the request sizeLimit ? My personal bet is : yes.

Yes, I would say the 'sizeLimit' is the request sizeLimit.

I think we should not ignore the request sizeLimit, I would consider it
as a client-side limit for the complete search over all pages.

Say the request sizeLimit is 10 and the page size is 8 for both requests
then the first result contains 8 entries and the second contains 2
entries plus an LDAP error #4.

> 2) so second question : what if in one of the subsequent requests, the
> page size limit is changed and is superior to the sizeLimit ? This
> request sizeLimit cannot have changed, otherwise the search request
> would have been considered as a new search ( "...a searchRequest with
> all values identical to the initial request with the exception of the
> messageID, the cookie, and optionally a modified pageSize..."). My
> personal guess is again to consider that we should deliver as many
> entries as we can, up to the sizeLimit, and generate an LDAP error #4 :
> sizeLimitExceeded.

Again, I think the request sizeLimit is the total number of returned
entries.

> 3) regarding the search request immutability : it's pretty hard to check
> that the filter hasn't changed, as it may be a complex one, with a
> different structure and a different order. I think that this
> constraint is fully absurd, as the client will obviously create one
> request, and send a null cookie every time it will send a new paged
> search, so I don't see the validity of such a check. Nevertheless,
> should we try to implement such a check ? My personal guess, again, is
> that it's useless.

To check if the filter changed logically is really complex. I think you
could just check if the string representation or the bytes that are
received were changed.

> I'm also interested to have some feedback about how this control is
> handled by the other ldap servers, considering the many factors
> influencing this control :
> - internal server size limit

Here implementations differ.
- OpenLDAP just stops if the server side limit is exceeded.
- MS AD has a default server side limit of 1000, but when this
control is used you could get more

> - how many of such paged search can be handled for a single client
Infinite?

> - what happens when we send a bad cookie to the server
Either start a new search or return an error

> - what happens when we play with the sizeLimit parameter
You mean the request sizeLimit? If it is changed it should be considered
as a new search.

> Last, not least, as we are using cursors to get the entries from the
> backend, we are able to move forward or backward. It would be
> interesting to extend this control to allow a backward pagedSearch (for
> instance, providing a negative paged size). Would it be interesting ?

There is another control for this: VLV, see [1]. But it also needs
server-side sorting.

Regards,
Stefan


[1] https://issues.apache.org/jira/browse/DIRSERVER-1265