You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@jackrabbit.apache.org by Jörg Hoh <jh...@googlemail.com.INVALID> on 2018/10/11 09:03:20 UTC

Indexing performance degrading over time (RDB)

Hi,

On my Oak 1.8.2 (RDB) setup I observed that on index rebuilds the
performance seems to degrade over the time of this rebuild (logs trimmed
for brevity):

*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing will be performed for following indexes: [/oak:index/nodetype]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #10000 <path> [1428,43 nodes/s, 5142342,86 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #20000 <path> [1666,58 nodes/s, 5999700,00 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #30000 <path> [1764,65 nodes/s, 6352729,41 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #40000 <path> [1904,71 nodes/s, 6856971,43 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #50000 <path> [1923,04 nodes/s, 6922938,46 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #60000 <path> [2068,93 nodes/s, 7448151,72 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #70000 <path> [2058,79 nodes/s, 7411658,82 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #80000 <path> [2105,24 nodes/s, 7578852,63 nodes/hr]
...
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #310000 <path> [1107,14 nodes/s, 3985701,43 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #320000 <path> [1066,66 nodes/s, 3839988,00 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #330000 <path> [1031,25 nodes/s, 3712488,75 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #340000 <path> [1000,00 nodes/s, 3599989,41 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #350000 <path> [974,93 nodes/s, 3509739,28 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #360000 <path> [939,95 nodes/s, 3383802,61 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #370000 <path> [918,11 nodes/s, 3305201,99 nodes/hr]
...
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #540000 <path> [700,39 nodes/s, 2521396,11 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #550000 <path> [697,08 nodes/s, 2509501,14 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #560000 <path> [691,36 nodes/s, 2488884,44 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #570000 <path> [686,75 nodes/s, 2472284,82 nodes/hr]
...
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #990000 <path> [603,66 nodes/s, 2173168,54 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #1000000 <path> [601,68 nodes/s, 2166062,82 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #1010000 <path> [597,99 nodes/s, 2152750,98 nodes/hr]
...
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #1460000 <path> [562,84 nodes/s, 2026212,95 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #1470000 <path> [562,14 nodes/s, 2023707,99 nodes/hr]
...
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #1870000 <path> [563,93 nodes/s, 2030155,73 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #1880000 <path> [563,38 nodes/s, 2028167,94 nodes/hr]
...
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #2350000 <path> [571,36 nodes/s, 2056891,90 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #2360000 <path> [570,46 nodes/s, 2053661,20 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing Traversed #2370000 <path> [569,99 nodes/s, 2051947,19 nodes/hr]
*INFO* [Apache Sling Repository Startup Thread] o.a.j.o.p.index.IndexUpdate
Reindexing completed

For the about first 100k of nodes the performance (indexed nodes per
second) is increasing up to about 2200 nodes/s, but then the performance is
degrading until at around 1.2M nodes it's down at  560 nodes/s and stays
there.

I observe this at both property index and also with async lucene property
index, although the initial number is even higher there (it starts with
more than 6'000 nodes per second being index, degrading down to 400).

I want to understand if this is a random behavior or if others have
observed this as well. Also I would like to know if there are any knobs to
tune the indexing performance itself whilst staying on RDB (moving away is
not an option right now).

Thanks,
Jörg

-- 
http://cqdump.wordpress.com
Twitter: @joerghoh