Posted to dev@manifoldcf.apache.org by Julien Massiera <ju...@francelabs.com> on 2023/03/17 17:36:47 UTC

Control over number of processed documents per thread

Hi Karl,

I was debugging a repository connector because I was disappointed with its
performance, and I noticed that the processDocuments method is called each
time with only one document identifier instead of a batch, although the
seeding phase had referenced 24k ids. What explains that? Can we control
the number of documentIdentifiers passed to processDocuments per thread?
For instance, if we know the ideal number of documents that an API can
process at once, it would be very useful to be able to set it per thread.
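
The kind of batching described above can be sketched as follows. This is a
standalone illustration only: DocumentBatcher, partition and the batch size
of 2 are hypothetical names and values chosen for the example, and nothing
here is ManifoldCF API or reflects how the framework groups identifiers
internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of ManifoldCF: splits a list of document
// identifiers into consecutive batches of at most batchSize elements.
class DocumentBatcher {

    static List<List<String>> partition(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the sublist so each batch is independent of the source list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Assumed sample identifiers; a real job would have thousands.
        List<String> ids = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
        for (List<String> batch : partition(ids, 2)) {
            // Each batch stands in for one call that handles several ids at once.
            System.out.println(batch.size() + " ids: " + batch);
        }
    }
}
```

With a tunable batch size, each call could hand the remote API as many
identifiers as it handles well, instead of one at a time.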

 

Another thing: I also noticed that the seeding phase and the cleanup phase
seem to process documents in groups of 100/200 at a time. Again, is that
configured somewhere, and can we control it?

 

Thanks,

Julien