Posted to dev@directory.apache.org by "Jim Birch (JIRA)" <ji...@apache.org> on 2008/05/16 07:16:55 UTC

[jira] Created: (DIRSTUDIO-327) Repeated searches for complete result set.

Repeated searches for complete result set.
------------------------------------------

                 Key: DIRSTUDIO-327
                 URL: https://issues.apache.org/jira/browse/DIRSTUDIO-327
             Project: Directory Studio
          Issue Type: Improvement
    Affects Versions: 1.2.0
         Environment: Win2k workstation, java 1.6.0_05

            Reporter: Jim Birch


Windows servers have a default server side limit of 1000 returned objects.  AFAIK the normal way of handling this is to detect that a search returns an incomplete result set and make further requests to span the full result set.

There appears to be no such capability in dirstudio which makes searching 15K users extremely messy.
   

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Jim Birch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627896#action_12627896 ] 

Jim Birch commented on DIRSTUDIO-327:
-------------------------------------

Thanks Stefan,

Browsing the structure works with no problems.

I can't get more than 1000 results for a search.  Have I missed a setting somewhere, or do I get paged result only on browse? This isn't a showstopper.  I'm typically searching for a few entries that match something.

I also noticed that my Excel export slowed to a crawl and then ran out of Java heap space after around 1540 items out of a 13K list.  There's a warning about that, so it's to be expected.  I could bump the heap space, but CSV works OK - probably a better way to go anyway.

Regards
Jim
 

> Add support for Paged Results Control
> -------------------------------------
>
>                 Key: DIRSTUDIO-327
>                 URL: https://issues.apache.org/jira/browse/DIRSTUDIO-327
>             Project: Directory Studio
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>         Environment: Win2k workstation, java 1.6.0_05
>            Reporter: Jim Birch
>            Assignee: Stefan Seelmann
>             Fix For: 1.3.0
>
>
> Windows servers have a default server side limit of 1000 returned objects.  AFAIK the normal way of handling this is to detect that a search returns an incomplete result set and make further requests to span the full result set.
> There appears to be no such capability in dirstudio which makes searching 15K users extremely messy.
>    



[jira] Resolved: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Stefan Seelmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Seelmann resolved DIRSTUDIO-327.
---------------------------------------

    Resolution: Fixed

Fixed here:
  http://svn.apache.org/viewcvs?view=rev&rev=691070



[jira] Commented: (DIRSTUDIO-327) Repeated searches for complete result set.

Posted by "Emmanuel Lecharny (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597359#action_12597359 ] 

Emmanuel Lecharny commented on DIRSTUDIO-327:
---------------------------------------------

There is a 'paged control' RFC (http://www.ietf.org/rfc/rfc2696.txt) which is implemented by AD. 

We just have to implement it in Studio (and add an option to set it up).
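
For illustration only, here is a minimal JNDI sketch of the request/response loop that RFC 2696 describes. It is not the Studio implementation. The class and method names (PagedSearchSketch, searchAll, nextCookie, pageControl) are made up for this example, and the caller is assumed to supply an already connected LdapContext. The control classes themselves are the standard javax.naming.ldap ones.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;
import javax.naming.ldap.Control;
import javax.naming.ldap.LdapContext;
import javax.naming.ldap.PagedResultsControl;
import javax.naming.ldap.PagedResultsResponseControl;

// Hypothetical helper class, not part of Studio or ApacheDS.
class PagedSearchSketch {

    /** Returns the server's paging cookie from the response controls, or null if none was sent. */
    static byte[] nextCookie(Control[] responseControls) {
        if (responseControls == null) {
            return null;
        }
        for (Control c : responseControls) {
            if (c instanceof PagedResultsResponseControl) {
                return ((PagedResultsResponseControl) c).getCookie();
            }
        }
        return null;
    }

    /** Builds the request control for the first page (no cookie) or a follow-up page. */
    static Control pageControl(int pageSize, byte[] cookie) throws IOException {
        return cookie == null
                ? new PagedResultsControl(pageSize, Control.CRITICAL)
                : new PagedResultsControl(pageSize, cookie, Control.CRITICAL);
    }

    /** Collects the DNs of all matches by requesting one page after another. */
    static List<String> searchAll(LdapContext ctx, String baseDn, String filter, int pageSize)
            throws NamingException, IOException {
        List<String> dns = new ArrayList<>();
        SearchControls sc = new SearchControls();
        sc.setSearchScope(SearchControls.SUBTREE_SCOPE);

        byte[] cookie = null;
        do {
            ctx.setRequestControls(new Control[] { pageControl(pageSize, cookie) });
            NamingEnumeration<SearchResult> page = ctx.search(baseDn, filter, sc);
            while (page.hasMore()) {
                dns.add(page.next().getNameInNamespace());
            }
            // The response controls are only available once the page is fully read.
            cookie = nextCookie(ctx.getResponseControls());
        } while (cookie != null && cookie.length > 0); // an empty cookie means "no more pages"
        return dns;
    }
}
```

Against AD the page size would presumably be kept at or below the 1000-entry server limit mentioned in the issue description, e.g. searchAll(ctx, baseDn, filter, 500), where baseDn and filter are whatever the search needs.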



[jira] Commented: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Jim Birch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628204#action_12628204 ] 

Jim Birch commented on DIRSTUDIO-327:
-------------------------------------

Stefan

Your analysis of what I did is right.  Thanks for the memory info.  If I need to dump any big lumps of data I can go via CSV, or bump the heap right up for the job.  I've got a few GB to play with on my desktop :-)

Thanks for your efforts.  DirStudio is really well thought out and put together, a pleasure to use.  All that at only V1.3!  It has been a brilliant tool for the data analysis side of the project I'm working on: identity management and provisioning of users in AD and various other applications using HR data exports.  Keep up the good work.

Regards, Jim



[jira] Commented: (DIRSTUDIO-327) Repeated searches for complete result set.

Posted by "Jim Birch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626386#action_12626386 ] 

Jim Birch commented on DIRSTUDIO-327:
-------------------------------------

Any chance of getting this issue bumped into an upcoming release?  I'm kinda in love with DirStudio - she's beautiful - but this is putting pressure on the relationship ;-)

The alternative workaround of bumping the server side limit is considered very bad practice:  this setting applies across all controllers in the Active Directory domain, so it leaves a lot of targets open to a DoS attack in a corporate situation.  Everything I've seen warns against it, eg: [http://searchwindowsserver.techtarget.com/tip/0,289483,sid68_gci1265206,00.html], and some management tools automatically flag a server side limit above 1000 as a problem.

I guess there are a lot of other AD ldap users who could use the tool if it has paged search support.  There's a few here.

I'm not sure about the ApacheDS design philosophy, but paged search might be a good idea for ApacheDS too.

The code changes look well-contained and pretty easy to implement.  It would make the LDAP implementation more complete.  There's some sample code here, if it helps: [http://java.sun.com/docs/books/tutorial/jndi/newstuff/paged-results.html].

I wouldn't want to mess with the code myself but *I'm happy to do some AD testing here.*



[jira] Commented: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Stefan Seelmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627969#action_12627969 ] 

Stefan Seelmann commented on DIRSTUDIO-327:
-------------------------------------------

I think I was able to reproduce what you did: You first searched 13K entries and they were displayed in the search result editor, then you exported them, right?

Performing a search with 10K results and displaying them in the search result editor costs about 50MB of memory. The search results are cached in memory as long as your connection is open. 5kB per entry sounds like a lot, but it covers our internal object model (entry, DN, RDN, attributes, values, parent-child relationship) and the UI objects used to display it (Table, Rows, Column, Fonts).

Exporting 10K entries to Excel costs about another 50MB of memory. We use the Apache POI library for that, and we must create the Excel file in memory.

Exporting to LDIF or CSV is cheaper because each entry received from the server is immediately streamed to the file. (With LDIF the CPU usage is too high, need to check that...)
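
To make the difference concrete, here is a rough sketch of the streaming idea: each entry is formatted and written as soon as it arrives, so nothing beyond the current row has to stay in memory. This is only an illustration, not the Studio export code; the class and method names are invented, and every field is simply quoted with embedded quotes doubled.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.List;

// Hypothetical illustration of row-at-a-time CSV output.
class StreamingCsvSketch {

    /** Formats one entry's values as a single CSV line, quoting every field. */
    static String toCsvLine(List<String> values) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                line.append(',');
            }
            // Double any embedded quotes, then wrap the field in quotes.
            line.append('"').append(values.get(i).replace("\"", "\"\"")).append('"');
        }
        return line.append('\n').toString();
    }

    /** Writes one entry immediately; the caller keeps no list of earlier rows. */
    static void writeEntry(Writer out, List<String> values) throws IOException {
        out.write(toCsvLine(values));
    }
}
```

Calling writeEntry once per search result keeps memory use flat regardless of the result count, which is the property described above. A spreadsheet file built as an in-memory object model cannot be written this way.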

So 40MB (Studio/Eclipse footprint) + 50MB (10000 search results) + 50MB (Excel export) is more than the 128MB default heap size.
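
The same arithmetic as a small sketch, in case someone wants to plug in other entry counts. The 40MB baseline and the roughly 5kB per entry come from the figures above; treating both the cache and the Excel export as scaling linearly with the entry count is an assumption of this example, and the class and method names are invented.

```java
// Hypothetical back-of-the-envelope estimate based on the figures in this comment.
class HeapEstimateSketch {

    static final int BASELINE_MB = 40;        // Studio/Eclipse footprint
    static final int CACHE_KB_PER_ENTRY = 5;  // 50MB for 10K cached search results
    static final int EXCEL_KB_PER_ENTRY = 5;  // 50MB for a 10K-entry Excel export

    /** Estimated heap use in MB (decimal units, linear scaling assumed). */
    static int estimatedHeapMb(int entries, boolean excelExport) {
        int cacheMb = entries * CACHE_KB_PER_ENTRY / 1000;
        int excelMb = excelExport ? entries * EXCEL_KB_PER_ENTRY / 1000 : 0;
        return BASELINE_MB + cacheMb + excelMb;
    }

    /** True if the estimate fits into the given maximum heap size. */
    static boolean fitsIn(int heapMb, int entries, boolean excelExport) {
        return estimatedHeapMb(entries, excelExport) <= heapMb;
    }
}
```

For 10000 entries with an Excel export this gives 40 + 50 + 50 = 140, which does not fit into 128; without the export it gives 90, which does.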

So what I could suggest (if appropriate):
- you already increased heap memory
- only perform small searches within Studio, use the count limit and/or paged search
- if you need to export large data
  - use CSV (you already do) or LDIF
  - if you need Excel, first close all connections (to flush caches), open the right connection, then run the export without performing a search
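
For anyone doing the same thing from code rather than from the Studio UI, a count limit can be set on a plain JNDI search roughly like this. It is a sketch only; the class and method names are invented, and the limit value is up to the caller.

```java
import javax.naming.directory.SearchControls;

// Hypothetical helper showing a client-requested size limit.
class CountLimitSketch {

    /** Subtree search controls that ask the server for at most maxEntries results. */
    static SearchControls limited(long maxEntries) {
        SearchControls sc = new SearchControls();
        sc.setSearchScope(SearchControls.SUBTREE_SCOPE);
        sc.setCountLimit(maxEntries); // 0 would mean "no client-side limit"
        return sc;
    }
}
```

When more entries match than the limit allows, JNDI typically reports this with a SizeLimitExceededException while the results are being enumerated, so calling code should expect to catch it.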

Hm, perhaps we should start a new process for each search and export, like g**gle does with chr*me ;-)




[jira] Closed: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Pierre-Arnaud Marcelot (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre-Arnaud Marcelot closed DIRSTUDIO-327.
--------------------------------------------




[jira] Commented: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Emmanuel Lecharny (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627906#action_12627906 ] 

Emmanuel Lecharny commented on DIRSTUDIO-327:
---------------------------------------------

Which JVM are you using? With which flags?



[jira] Commented: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Stefan Seelmann (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627929#action_12627929 ] 

Stefan Seelmann commented on DIRSTUDIO-327:
-------------------------------------------

Hi Jim,

glad to hear that it works. 

The problem with the Excel export is that we need to hold all data in memory. However, 1540 items should not cause memory issues; I'll investigate. Could you please tell me your heap size settings before and after increasing it?

Thanks,
Stefan





[jira] Issue Comment Edited: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Jim Birch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627896#action_12627896 ] 

jimbirch edited comment on DIRSTUDIO-327 at 9/2/08 9:40 PM:
-------------------------------------------------------------

Thanks Stefan,

Browsing the structure works with no problems.

(CORRECTION) Searching is OK too; I needed to enable paged search and bump the heap size.

I also noticed that my Excel export slowed to a crawl and then ran out of Java heap space after around 1540 items out of a 13K list.  There's a warning about that, so it's to be expected.  I could bump the heap space, but CSV works OK - probably a better way to go anyway.

(MORE) After increasing heap space I got to about 20k entries on the Excel export before a heap space error.  C'est la vie.

I'm happy for you to close this issue.  It's working for me.

Regards
Jim
 

      was (Author: jimbirch):
    Thanks Stefan,

Browsing the structure works with no problems.

I can't get more than 1000 results for a search.  Have I missed a setting somewhere, or do I get paged result only on browse? This isn't a showstopper.  I'm typically searching for a few entries that match something.

I also noticed that my Excel export slowed to a crawl then runs out of Java heap space after around 1540 items out of a 13K list.  There's a warning on that so to be expected.  I could bump the heap space but   CSV works ok - probably a better way to go anyway.

Regards
Jim
 
  


[jira] Updated: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Stefan Seelmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Seelmann updated DIRSTUDIO-327:
--------------------------------------

    Fix Version/s: 1.3.0
         Assignee: Stefan Seelmann
          Summary: Add support for Paged Results Control  (was: Repeated searches for complete result set.)

Hi Jim, I just added support for paged results control. Feel free to test the nightly build and report if it fits your needs or if you need improvements.




[jira] Commented: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Jim Birch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627927#action_12627927 ] 

Jim Birch commented on DIRSTUDIO-327:
-------------------------------------

I'm on Sun J2SE 1.6.0_07 on Windows XP.  There's a doc on setting the heap size and other stuff here [http://directory.apache.org/studio/faqs.html] (covers Linux too).  Basically it's an ini file alongside the DirStudio exe.

Is that what you want? 
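
For reference, a quick way to see which JVM version and maximum heap a given set of flags actually produces is a tiny program like the one below, run with the same java executable and the same -Xmx value. It only illustrates the question being asked and is not something Studio ships; the class name is invented.

```java
// Hypothetical standalone check of JVM version and effective maximum heap.
class HeapReportSketch {

    /** Maximum heap the JVM will attempt to use, in MB (reflects the -Xmx setting). */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("max heap MB  = " + maxHeapMb());
    }
}
```

The value reported by maxMemory() is usually slightly below the configured -Xmx figure, so a small difference is nothing to worry about.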

     



[jira] Updated: (DIRSTUDIO-327) Add support for Paged Results Control

Posted by "Pierre-Arnaud Marcelot (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DIRSTUDIO-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre-Arnaud Marcelot updated DIRSTUDIO-327:
---------------------------------------------

    Component/s: studio-ldapbrowser
