You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Eks Dev (JIRA)" <ji...@apache.org> on 2006/09/04 23:03:24 UTC

[jira] Updated: (LUCENE-584) Decouple Filter from BitSet

     [ http://issues.apache.org/jira/browse/LUCENE-584?page=all ]

Eks Dev updated LUCENE-584:
---------------------------

    Attachment: Some Matchers.zip

Here are some Matcher implementations,

- OpenBitsMatcher- the same as the code Paul wrote for BitsMatcher, with replaced OpenBitSet instead 

-DenseOpenBitsMatcher  - Using solr BitSetIterator (for skipTo() to work, one method in BitSetIterator should become public)

Also attached one simple  test (just basic fuctionality) that also contains one dummy relative performance  test 

Perf. test simply iterates over different Matcher implementations  and measures ellapsed time (not including Matcher creation, pure forward scan to the end) for different set bit densities.

imho, this code is not sufficiantly tested nor commented, needs an hour or two.  

As expected, Yonik made this ButSetIterator really fast. What was surprise for me was OpenBitSet nextSetBit() comparing bad to the BitSet  (or I made some dummy mistake somewhere?)

> Decouple Filter from BitSet
> ---------------------------
>
>                 Key: LUCENE-584
>                 URL: http://issues.apache.org/jira/browse/LUCENE-584
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.0.1
>            Reporter: Peter Schäfer
>            Priority: Minor
>         Attachments: BitsMatcher.java, Filter-20060628.patch, HitCollector-20060628.patch, IndexSearcher-20060628.patch, MatchCollector.java, Matcher.java, Matcher20060830b.patch, Scorer-20060628.patch, Searchable-20060628.patch, Searcher-20060628.patch, Some Matchers.zip, SortedVIntList.java, TestSortedVIntList.java
>
>
> {code}
> package org.apache.lucene.search;
> public abstract class Filter implements java.io.Serializable 
> {
>   public abstract AbstractBitSet bits(IndexReader reader) throws IOException;
> }
> public interface AbstractBitSet 
> {
>   public boolean get(int index);
> }
> {code}
> It would be useful if the method =Filter.bits()= returned an abstract interface, instead of =java.util.BitSet=.
> Use case: there is a very large index, and, depending on the user's privileges, only a small portion of the index is actually visible.
> Sparsely populated =java.util.BitSet=s are not efficient and waste lots of memory. It would be desirable to have an alternative BitSet implementation with smaller memory footprint.
> Though it _is_ possibly to derive classes from =java.util.BitSet=, it was obviously not designed for that purpose.
> That's why I propose to use an interface instead. The default implementation could still delegate to =java.util.BitSet=.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by eks dev <ek...@yahoo.co.uk>.
"What's the point of using a sorted interval list for a category?"

Just terminology first to avoid misunderstanding :), category is "category field" that can take N valus

Now, the case I am facing goes as follows:

I have category field in 50Mio collection which has more or less uniform distribution of values, 
imagine 100 values (terms) each with 0.5Mio documents. when I sort index on it, bit vectors look like
00000000......1111111111....00000000

So, if I cache this in any other representation (Filter usage, no scoring contribution as typical for categories) this adds up in memory rather quickly. With interval list for Filter/Matcher I need 2 Ints to store cache entry per value/term and have really fast next()-SkipTo(). In total 200 Ints for sorted category field with 100 values....

Not an unusual case I guess, a lot of people can sort their indexes, I guess at least.

Of course, it is a bit simplified, as we should not forget dynamic nature of the index. In real life, you can afford to sort your index once per month or once per week, but small percentage of updates do not introduce significant problems for interval lists.

Does this make sense to you, or I misread your aproach?


What woud be sexy to do, is to use such representation for postings storage, so each postings list would have type field that determines optimal bit vector representation for the term (high level, did not look into it yet in details). So instead of VInts, we could have interval lists for such cases. This would be  equivalent to RLE  encoding... which can be done fast I guess. This field type could be determined in some kind of *oprional* post-optimize proces that scanns postings and tries to find optimal "compresion", only VInt, VInt with rle for "predominantly sorted" postings....  ANd this is not 
limited to rle only, I could imagine some smart people to come up with even smarter things...


It looks like good idea to me, but I would not be surprised if somebody proves that I missed the point totaly :)



 







---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by Paul Elschot <pa...@xs4all.nl>.
On Thursday 07 September 2006 10:03, eks dev wrote:
...
> 
> on the other note,
> the key for really efficiant matchers will be good SmartMatcherFactory that 
picks the best representation for given density/"sortednes". 
> The cases I've been able to identify so far:
> 
> Very Low density - IntList
> Low density VIntSortedList
> Dense - OpenBitSet/BitSetIterator or such
> Sorted - (imagine case where you have an oportunity to sort your index on 
category field, quite offten I guess as it does not require absolute 
"sortedness", it is enough to sort periodicly without caring for smaller 
updates). There, one simple interval list can do the magic in just a few 
bytes of memory, even in high density cases. 
...
> More ideas on this?

What's the point of using a sorted interval list for a category?
With the patch, a TermScorer is a Matcher, so one could
use a TermScorer to filter a category assuming the category
has an indexed term.
For filtering, it might be worthwhile to introduce a TermMatcher
to avoid the scoring done in TermScorer.

Regards,
Paul Elschot

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by eks dev <ek...@yahoo.co.uk>.
As suspected in my first email on this topic, test is wrong, garbage collection was kicking in at wrong times so the results were convoluted. Ah, old good predictable c times are long gone. 

Anyhow, appologies for this noise (particulary to Yonik). The real speed relationships are attached now, everything looks as it should be, BitSetIterator is slower than OpenBitSet in skipTo() scenario...

 



----- Original Message ----
From: eks dev <ek...@yahoo.co.uk>
To: java-dev@lucene.apache.org
Sent: Thursday, 7 September, 2006 10:03:45 AM
Subject: Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

The same effect showed-up on athlon CPU... The only diference between your PerfTest and what I have been doing is that your loop is as tight as it goes. 
*Maybe* worth of investigating a bit futher as my tests look a bit closer to the real usage scenarios.
 
I will try to identify cases where OpenBitSet nextSetBit starts being slower ... Loks a lot like what you said, ntz() vs Long.numberOfTrailingZeros(). The second one is (I guess) native call, so optimizations from ntz() go in and out of some cache (too deep for me anyhow, just shooting in the dark). 

Good news, inspite of your doubts, BitSetIterator rocks in skipTo scenario! VInt as well, even in  medium density cases (memory profitability border is now the only condidtion for its usage).

on the other note,
the key for really efficiant matchers will be good SmartMatcherFactory that picks the best representation for given density/"sortednes". 
The cases I've been able to identify so far:

Very Low density - IntList
Low density VIntSortedList
Dense - OpenBitSet/BitSetIterator or such
Sorted - (imagine case where you have an oportunity to sort your index on category field, quite offten I guess as it does not require absolute "sortedness", it is enough to sort periodicly without caring for smaller updates). There, one simple interval list can do the magic in just a few bytes of memory, even in high density cases. 
 
Finding good strategy to find optimal Matcher representation is not all that hard, but really deciding when to do it will be a bit trickier (in warm-up case is easy, but in dynamic LRU case it gets complicated as you need as the first efficiently to collect bits in some hits collector, and than transform it to the optimal representation (SmartBitSet), quickly as otherwise the benefit of LRU caching will quickly get lost on time spant updateing cache...)

More ideas on this?

----- Original Message ----
From: Yonik Seeley <yo...@apache.org>
To: java-dev@lucene.apache.org
Sent: Wednesday, 6 September, 2006 11:54:14 PM
Subject: Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

On 9/6/06, eks dev <ek...@yahoo.co.uk> wrote:
> still, DenseBitsMatcher (BitSetIterator warpped in Matcher) works faster than anything else for this case:
>
>  int skip(Matcher m) throws IOException{
>      int doc=-1, ret = 0;
>      while(m.skipTo(doc+1)){
>         doc = m.doc();
>         ret+=doc;
>       }
>      return ret;
>   }

Huh... that doesn't make much sense.
Maybe it's your pentium-M with it's shorter pipeline and less penalty
for a pipeline flush on a branch prediction miss.  You could try
substituting BitUtil.ntz2 for ntz in OpenBitSet.nextSetBit... it uses
a full binary search down to the byte level, and then an array lookup.

The nice thing about being *open* is that you can write more than one
type of iterator if you really want to:-)

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org





Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by eks dev <ek...@yahoo.co.uk>.
The same effect showed-up on athlon CPU... The only diference between your PerfTest and what I have been doing is that your loop is as tight as it goes. 
*Maybe* worth of investigating a bit futher as my tests look a bit closer to the real usage scenarios.
 
I will try to identify cases where OpenBitSet nextSetBit starts being slower ... Loks a lot like what you said, ntz() vs Long.numberOfTrailingZeros(). The second one is (I guess) native call, so optimizations from ntz() go in and out of some cache (too deep for me anyhow, just shooting in the dark). 

Good news, inspite of your doubts, BitSetIterator rocks in skipTo scenario! VInt as well, even in  medium density cases (memory profitability border is now the only condidtion for its usage).

on the other note,
the key for really efficiant matchers will be good SmartMatcherFactory that picks the best representation for given density/"sortednes". 
The cases I've been able to identify so far:

Very Low density - IntList
Low density VIntSortedList
Dense - OpenBitSet/BitSetIterator or such
Sorted - (imagine case where you have an oportunity to sort your index on category field, quite offten I guess as it does not require absolute "sortedness", it is enough to sort periodicly without caring for smaller updates). There, one simple interval list can do the magic in just a few bytes of memory, even in high density cases. 
 
Finding good strategy to find optimal Matcher representation is not all that hard, but really deciding when to do it will be a bit trickier (in warm-up case is easy, but in dynamic LRU case it gets complicated as you need as the first efficiently to collect bits in some hits collector, and than transform it to the optimal representation (SmartBitSet), quickly as otherwise the benefit of LRU caching will quickly get lost on time spant updateing cache...)

More ideas on this?

----- Original Message ----
From: Yonik Seeley <yo...@apache.org>
To: java-dev@lucene.apache.org
Sent: Wednesday, 6 September, 2006 11:54:14 PM
Subject: Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

On 9/6/06, eks dev <ek...@yahoo.co.uk> wrote:
> still, DenseBitsMatcher (BitSetIterator warpped in Matcher) works faster than anything else for this case:
>
>  int skip(Matcher m) throws IOException{
>      int doc=-1, ret = 0;
>      while(m.skipTo(doc+1)){
>         doc = m.doc();
>         ret+=doc;
>       }
>      return ret;
>   }

Huh... that doesn't make much sense.
Maybe it's your pentium-M with it's shorter pipeline and less penalty
for a pipeline flush on a branch prediction miss.  You could try
substituting BitUtil.ntz2 for ntz in OpenBitSet.nextSetBit... it uses
a full binary search down to the byte level, and then an array lookup.

The nice thing about being *open* is that you can write more than one
type of iterator if you really want to:-)

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by Yonik Seeley <yo...@apache.org>.
On 9/6/06, eks dev <ek...@yahoo.co.uk> wrote:
> still, DenseBitsMatcher (BitSetIterator warpped in Matcher) works faster than anything else for this case:
>
>  int skip(Matcher m) throws IOException{
>      int doc=-1, ret = 0;
>      while(m.skipTo(doc+1)){
>         doc = m.doc();
>         ret+=doc;
>       }
>      return ret;
>   }

Huh... that doesn't make much sense.
Maybe it's your pentium-M with it's shorter pipeline and less penalty
for a pipeline flush on a branch prediction miss.  You could try
substituting BitUtil.ntz2 for ntz in OpenBitSet.nextSetBit... it uses
a full binary search down to the byte level, and then an array lookup.

The nice thing about being *open* is that you can write more than one
type of iterator if you really want to:-)

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by eks dev <ek...@yahoo.co.uk>.
"Keep in mind that BitSetIterator is fast for iteration over all it's bits.
If it's used as a filter (with skipping), I would expect it to be slower."

still, DenseBitsMatcher (BitSetIterator warpped in Matcher) works faster than anything else for this case:

 int skip(Matcher m) throws IOException{
     int doc=-1, ret = 0; 
     while(m.skipTo(doc+1)){
        doc = m.doc(); 
        ret+=doc;
      }
     return ret;
  }


for sparse bit sets, VInt thingy works the best 




Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by Paul Elschot <pa...@xs4all.nl>.
There is another aspect of this influences future performance:

When a Matcher is used to implement filtering, it can also be pushed down
into a boolean query as a required "clause". It would then end up
being called in the tight loop of ConjunctionScorer as one of the
required "clauses", instead of only being called after returning from 
the Scorer for the query.

This pushing down is independent of the actual Matcher, but I think
it would make differences between Matchers more visible.

Regards,
Paul Elschot

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by eks dev <ek...@yahoo.co.uk>.
Wierd indeed! I tend to beleive I made something stupid in Matcher test, but it looks too simple (OpenBitsMatcher and BitsMatcher are identical?!). I even itterated the same test many times, and there is no big difference   in results, speed curve is only not so jittery like on single itteration...

humpf,  wild guess out of my league: maybe  some instruction cache  in cpu in slightly more complex execution path breaks your optimization.?



With your test it is rather clear who wins:

-Xmx128m -server -Xbatch 8000000 5 7500000 nextSetBit 10 bit
ret=-1704152202
TIME=10996

-Xmx128m -server -Xbatch 8000000 5 7500000 nextSetBit 10 open
ret=-1704152202
TIME=8092

Jvm 32 bit "build 1.6.0-rc-b94"
 CPU Intel Pentium M 1.60GHz notebook
 

----- Original Message ----
From: Yonik Seeley <yo...@apache.org>
To: java-dev@lucene.apache.org
Sent: Tuesday, 5 September, 2006 11:15:42 PM
Subject: Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

On 9/4/06, Eks Dev (JIRA) <ji...@apache.org> wrote:

> Here are some Matcher implementations,
>
> - OpenBitsMatcher- the same as the code Paul wrote for BitsMatcher, with replaced OpenBitSet instead
>
> -DenseOpenBitsMatcher  - Using solr BitSetIterator (for skipTo() to work, one method in BitSetIterator should become public)

Keep in mind that BitSetIterator is fast for iteration over all it's bits.
If it's used as a filter (with skipping), I would expect it to be slower.

> Also attached one simple  test (just basic fuctionality) that also contains one dummy relative performance  test
>
> Perf. test simply iterates over different Matcher implementations  and measures ellapsed time (not including Matcher creation, pure forward scan to the end) for different set bit densities.
>
> imho, this code is not sufficiantly tested nor commented, needs an hour or two.
>
> As expected, Yonik made this BitSetIterator really fast. What was surprise for me was OpenBitSet nextSetBit() comparing bad to the BitSet  (or I made some dummy mistake somewhere?)

That's weird... what CPU and in what mode (32 or 64 bit)?  What JVM params?
Do you get the same results from org.apache.solr.util.BitSetPerf for nextSetBit?

I did write it on a P4, and ntz (number of trailing zeros) is
currently optimized for a 32 bit CPU where a 64 bit shift may be
slightly more expensive.

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-584) Decouple Filter from BitSet

Posted by Yonik Seeley <yo...@apache.org>.
On 9/4/06, Eks Dev (JIRA) <ji...@apache.org> wrote:

> Here are some Matcher implementations,
>
> - OpenBitsMatcher- the same as the code Paul wrote for BitsMatcher, with replaced OpenBitSet instead
>
> -DenseOpenBitsMatcher  - Using solr BitSetIterator (for skipTo() to work, one method in BitSetIterator should become public)

Keep in mind that BitSetIterator is fast for iteration over all it's bits.
If it's used as a filter (with skipping), I would expect it to be slower.

> Also attached one simple  test (just basic fuctionality) that also contains one dummy relative performance  test
>
> Perf. test simply iterates over different Matcher implementations  and measures ellapsed time (not including Matcher creation, pure forward scan to the end) for different set bit densities.
>
> imho, this code is not sufficiantly tested nor commented, needs an hour or two.
>
> As expected, Yonik made this BitSetIterator really fast. What was surprise for me was OpenBitSet nextSetBit() comparing bad to the BitSet  (or I made some dummy mistake somewhere?)

That's weird... what CPU and in what mode (32 or 64 bit)?  What JVM params?
Do you get the same results from org.apache.solr.util.BitSetPerf for nextSetBit?

I did write it on a P4, and ntz (number of trailing zeros) is
currently optimized for a 32 bit CPU where a 64 bit shift may be
slightly more expensive.

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org