Posted to dev@directory.apache.org by Emmanuel Lécharny <el...@gmail.com> on 2015/03/03 19:50:28 UTC

Re: [Mavibot] Status

A quick update...

I was not able to work much on Mavibot these last two weeks. Still,
some progress was made last weekend:

- first, DIRSERVER_2047 has been investigated in depth. It was really
problematic, as the browsefrom(K) method was producing various NPEs and
wrong results. It took a long time to get it fixed. We have to thank
Lin Zhao for having been patient and helpful in getting the problem fixed.

- second, I ran a quick performance test to see what's going on.
The B-tree has 500 keys, each of them having 9 values (which
resolves to a sub-btree). Positioning the cursor before each key from
0 to 499 and browsing from there took 9 seconds forward, and 23 seconds
backward. That is a bit slow, considering we are fetching 500 * 250
tuples (500 iterations, each fetching half the tuples on average).
That was around 14 000 tuples fetched per second.

So I profiled the test. I was surprised to see that we were deserializing
a great many keys and values, which was not expected, as we have a cache.
After a bit of analysis, I saw that we were missing the cache many
times. There was one reason for that: the cache size was set to 1!!!
As soon as I set it to the default value (i.e. 1000), it took only 2.7
seconds to browse forward and 3.8 to browse backward, with no cache misses!
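
To illustrate why a cache of size 1 hurts so much with this access
pattern, here is a minimal sketch. It is not Mavibot's cache: it assumes
a plain LRU cache built on java.util.LinkedHashMap and one hypothetical
"page" per key, and it only counts how many lookups would have to go
back to deserialization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU cache that counts misses; a stand-in, not Mavibot's actual cache. */
final class LruMissCounter {

    /** Access-ordered LinkedHashMap that evicts the least recently used entry. */
    private static final class Lru extends LinkedHashMap<Integer, Integer> {
        private final int capacity;

        Lru(int capacity) {
            super(16, 0.75f, true); // true = iterate in access order (LRU)
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
            return size() > capacity;
        }
    }

    /**
     * Mimics the test: for each start key, browse forward to the last key,
     * touching one (hypothetical) page per key. Returns the number of misses.
     */
    static long countMisses(int cacheSize, int nbKeys) {
        Lru cache = new Lru(cacheSize);
        long misses = 0;

        for (int start = 0; start < nbKeys; start++) {
            for (int key = start; key < nbKeys; key++) {
                if (cache.get(key) == null) {
                    misses++;            // a real B-tree would deserialize here
                    cache.put(key, key);
                }
            }
        }

        return misses;
    }

    public static void main(String[] args) {
        System.out.println("cache size 1    : " + countMisses(1, 500) + " misses");
        System.out.println("cache size 1000 : " + countMisses(1000, 500) + " misses");
    }
}
```

In this toy model, a cache of size 1 misses on almost every access,
while a cache larger than the number of pages only misses once per page,
during the first pass.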

Those two changes make me think we should release now. The next steps
will be to move back the CPB btree, and manage the free pages.

More to come later ...


Re: [Mavibot] Status

Posted by Emmanuel Lécharny <el...@gmail.com>.
On 03/03/15 19:50, Emmanuel Lécharny wrote:
> - second, I ran a quick performance test to see what's going on.
> The B-tree has 500 keys, each of them having 9 values (which
> resolves to a sub-btree). Positioning the cursor before each key from
> 0 to 499 and browsing from there took 9 seconds forward, and 23 seconds
> backward. That is a bit slow, considering we are fetching 500 * 250
> tuples (500 iterations, each fetching half the tuples on average).
> That was around 14 000 tuples fetched per second.
Correction: we are actually browsing 500 * 250 * 9 = 1 125 000 tuples
here, as we have 9 values per key, so each key produces 9 tuples. That
makes it about 125 000 tuples per second when browsing forward (before
the fix).

That makes the browse operation even faster after the cache fix, with
well over 400 000 tuples fetched per second when browsing forward
(1 125 000 tuples in 2.7 seconds)...
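
The rates above can be re-derived from the timings given in this thread.
The sketch below only does that arithmetic; the class and method names
are made up for the example, and the inputs are the tuple counts and
durations quoted in the messages.

```java
/** Re-derives the browse throughput from the timings quoted in the thread. */
final class BrowseThroughput {

    /** Total tuples fetched: iterations * average keys per iteration * values per key. */
    static long totalTuples(int iterations, int avgKeysPerIteration, int valuesPerKey) {
        return (long) iterations * avgKeysPerIteration * valuesPerKey;
    }

    /** Tuples fetched per second for a given elapsed time. */
    static double tuplesPerSecond(long tuples, double seconds) {
        return tuples / seconds;
    }

    public static void main(String[] args) {
        long tuples = totalTuples(500, 250, 9);
        System.out.println("total tuples          : " + tuples);

        // Timings before the cache fix (cache size 1)
        System.out.printf("forward, before fix   : %.0f tuples/s%n", tuplesPerSecond(tuples, 9.0));
        System.out.printf("backward, before fix  : %.0f tuples/s%n", tuplesPerSecond(tuples, 23.0));

        // Timings after the cache fix (cache size 1000)
        System.out.printf("forward, after fix    : %.0f tuples/s%n", tuplesPerSecond(tuples, 2.7));
        System.out.printf("backward, after fix   : %.0f tuples/s%n", tuplesPerSecond(tuples, 3.8));
    }
}
```

With the 2.7 second forward timing this comes out at roughly 417 000
tuples per second after the fix, so the rate quoted above is best read
as a rounded figure.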


Re: [Mavibot] Status

Posted by Emmanuel Lécharny <el...@gmail.com>.
On 03/03/15 19:50, Emmanuel Lécharny wrote:
> More to come later ...
>
Ok, some more, but bad news...

At some point, the file can get corrupted. Typically, the free page list
becomes cyclic, which quickly leads to an OOM. This has to be investigated.

I suspect that concurrent reads and writes are causing this issue, which
would mean the btree is not protected against concurrent writes.
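
While investigating, a cheap sanity check for the cycling free page list
described above could look like the sketch below. It makes assumptions
that are not from this thread: the free list is modelled as a
hypothetical map from a page offset to the offset of the next free page,
with -1 marking the end of the list. It uses Floyd's two-pointer cycle
detection, so it needs constant memory and cannot itself run into an OOM
on a cyclic list.

```java
import java.util.HashMap;
import java.util.Map;

/** Detects a cycle in a singly linked free page list (hypothetical layout). */
final class FreePageListCheck {

    /** Hypothetical end-of-list marker; not taken from Mavibot's file format. */
    static final long NO_PAGE = -1L;

    /** Returns the offset of the next free page, or NO_PAGE if there is none. */
    private static long step(Map<Long, Long> next, long page) {
        return next.getOrDefault(page, NO_PAGE);
    }

    /**
     * Floyd's cycle detection: 'fast' advances two pages per round, 'slow' one.
     * If the list is cyclic, the two pointers eventually meet.
     */
    static boolean hasCycle(Map<Long, Long> next, long head) {
        long slow = head;
        long fast = head;

        while (fast != NO_PAGE) {
            fast = step(next, fast);
            if (fast == NO_PAGE) {
                return false;
            }
            fast = step(next, fast);
            slow = step(next, slow);

            if (fast != NO_PAGE && fast == slow) {
                return true;
            }
        }

        return false;
    }

    public static void main(String[] args) {
        Map<Long, Long> next = new HashMap<>();
        next.put(0L, 4096L);
        next.put(4096L, 8192L);
        next.put(8192L, NO_PAGE);
        System.out.println("healthy list cyclic? " + hasCycle(next, 0L));

        next.put(8192L, 4096L); // corrupt the list: last page points back
        System.out.println("corrupt list cyclic? " + hasCycle(next, 0L));
    }
}
```

Such a check only reports the corruption; it says nothing about what
caused it.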