
Posted to solr-user@lucene.apache.org by kafka0102 <ka...@163.com> on 2010/10/20 13:21:27 UTC

Why is Solr so much slower than Lucene?

Solr's SolrIndexSearcher.search(QueryResult qr, QueryCommand cmd) seems too slow to me. My index is about 500 MB and holds 3,984,274 records. My query looks like q=xx&fq=fid:1&fq=atm:[int_time1 TO int_time2].
The type of fid is <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
The type of atm is <fieldType name="sint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>.
For the test, I disabled Solr's caches in the config and ran the following plain Lucene code:

  private void test2(final ResponseBuilder rb) {
    try {
      final SolrQueryRequest req = rb.req;
      final SolrIndexSearcher searcher = req.getSearcher();
      final SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
      // ExecuteTimeStatics / ExecuteTimeUnit are timing helpers that are not shown in this post
      final ExecuteTimeStatics timeStatics = ExecuteTimeStatics.getExecuteTimeStatics();
      final ExecuteTimeUnit staticUnit = timeStatics.addExecuteTimeUnit("test2");
      staticUnit.start();
      // Fold every fq clause and the main query into a single BooleanQuery (all MUST)
      final List<Query> query = cmd.getFilterList();
      final BooleanQuery booleanFilter = new BooleanQuery();
      for (final Query q : query) {
        booleanFilter.add(new BooleanClause(q,Occur.MUST));
      }
      booleanFilter.add(new BooleanClause(cmd.getQuery(),Occur.MUST));
      logger.info("q:"+query);
      final Sort sort = cmd.getSort();
      // Search method inherited from Lucene's IndexSearcher: top 20 hits, no Filter
      final TopFieldDocs docs = searcher.search(booleanFilter,null,20,sort);
      final StringBuilder sbBuilder = new StringBuilder();
      for (final ScoreDoc doc :docs.scoreDocs) {
        sbBuilder.append(doc.doc+",");
      }
      logger.info("hits:"+docs.totalHits+",result:"+sbBuilder.toString());
      staticUnit.end();
    } catch (final Exception e) {
      throw new RuntimeException(e);
    }
  }

For the test, I called the code above first and then Solr's search(...). The result: Lucene takes about 20 ms and Solr about 70 ms.
I'm confused by the difference.
I also wrote another version that uses a filter, shown below, but the range query returns the wrong number of results.
Does anybody know why?

  private void test1(final ResponseBuilder rb) {
    try {
      final SolrQueryRequest req = rb.req;
      final SolrIndexSearcher searcher = req.getSearcher();
      final SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
      final ExecuteTimeStatics timeStatics = ExecuteTimeStatics.getExecuteTimeStatics();
      final ExecuteTimeUnit staticUnit = timeStatics.addExecuteTimeUnit("test1");
      staticUnit.start();
      final List<Query> query = cmd.getFilterList();
      final BooleanFilter booleanFilter = new BooleanFilter();
      for (final Query q : query) {
        // setFilter(...) is not shown in this post; presumably it adds each fq Query to the BooleanFilter
        setFilter(booleanFilter,q);
      }
      final Sort sort = cmd.getSort();
      // Same inherited search method as in test2, but with the fq clauses passed as a Filter
      final TopFieldDocs docs = searcher.search(cmd.getQuery(),booleanFilter,20,sort);
      logger.info("hits:"+docs.totalHits);
      staticUnit.end();
    } catch (final Exception e) {
      throw new RuntimeException(e);
    }
  }
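
A note on the measurement itself: the post times the Lucene path first and the Solr path second, once each. Below is a minimal sketch of a fairer harness, assuming nothing about Solr. The class name TimingSketch, the method medianNanos and the placeholder workloads are hypothetical; in the thread the two tasks would be test2(rb) and Solr's search(qr, cmd). It runs untimed warm-up iterations and reports a median, so that call order and JIT warm-up are less likely to distort the comparison.

```java
import java.util.Arrays;

class TimingSketch {
    static long sink; // keeps the placeholder work from being optimized away

    /** Runs the task `warmup` times untimed, then `runs` times timed; returns the median in nanoseconds. */
    static long medianNanos(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** Placeholder workload (hypothetical); stands in for a real search call. */
    static void work(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) {
            s += i ^ (i >>> 3);
        }
        sink = s;
    }

    public static void main(String[] args) {
        Runnable pathA = () -> work(200_000); // stand-in for the plain Lucene path
        Runnable pathB = () -> work(600_000); // stand-in for the Solr path
        System.out.println("A median ms: " + medianNanos(pathA, 5, 21) / 1_000_000.0);
        System.out.println("B median ms: " + medianNanos(pathB, 5, 21) / 1_000_000.0);
    }
}
```

The absolute numbers depend on the machine; only the relative medians are meaningful.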

RE: Does anyone notice this site?

Posted by Eric Martin <er...@makethembite.com>.

This is not legal advice. Take it for what it is: off the top of my head and
from what I know. I did not research this, but could, if Solr wants me to.

From a marketing standpoint, probably.

From a legal standpoint, they can do whatever they want with the name Solr
so long as they maintain a distance between any trademarked name and the
fundamental use of the trademark, unless there is a substantial connection
between the trademark name and recognition. Of course, that is to be
determined by a few factors: length in business, trademarks carried, and whether
or not the offending trademark makes a claim (not making a claim limits your
recovery substantially and may even nullify it). They are also in South
Africa, so throw in international law.

Of course, you also have fair use law, and this can get tricky. Here is an
example: myspace.com and moremyspace.com. If moremyspace.com is used as a
social networking site, then myspace has a claim. If it is used as a social
networking site in parody, then myspace has no legal claim whatsoever.

Another example is booble.com (not a work-safe link!). That case lasted many
years and Google lost.

Trademarks are a very tricky business and one that I will never practice.
Anyway, seeing as they are making a search engine, are using a
lower-level FQDN, and have not made a dent in the industry, it would be
futile to do anything but send them an email laying claim to the name Solr.

*If you do not send them a letter/email laying claim to Solr you will lose
your rights to fight that battle with IANA, etc or the ability to seek legal
remedy.*

Eric
Law Student - Second Year



-----Original Message-----
From: scott chu [mailto:scott.chu@udngroup.com] 
Sent: Monday, October 25, 2010 9:55 AM
To: solr-user@lucene.apache.org
Subject: Does anyone notice this site?

I happened to come across this site: http://www.solr.biz/

They say they are also developing a search engine. Is there any connection
to the open source "Solr"?

Re: Does anyone notice this site?

Posted by Peter Keegan <pe...@gmail.com>.

fwiw, our proxy server has blocked this site for malicious content.

Peter

On Mon, Oct 25, 2010 at 1:25 PM, Grant Ingersoll <gs...@apache.org> wrote:

>
> On Oct 25, 2010, at 12:54 PM, scott chu wrote:
>
> > I happened to come across this site: http://www.solr.biz/
> >
> > They say they are also developing a search engine. Is there any
> connection to the open source "Solr"?
>
>
> No, there is no connection, and they likely should not be using the name
> that way, as Solr is a TM of the ASF.
>
>

Re: Does anyone notice this site?

Posted by Grant Ingersoll <gs...@apache.org>.

On Oct 25, 2010, at 12:54 PM, scott chu wrote:

> I happened to come across this site: http://www.solr.biz/
> 
> They say they are also developing a search engine. Is there any connection to the open source "Solr"?

No, there is no connection, and they likely should not be using the name that way, as Solr is a TM of the ASF.

Does anyone notice this site?

Posted by scott chu <sc...@udngroup.com>.

I happened to come across this site: http://www.solr.biz/

They say they are also developing a search engine. Is there any connection
to the open source "Solr"?

Re: Why is Solr so much slower than Lucene?

Posted by kafka0102 <ka...@163.com>.

Thanks a lot.
I got it.

On 2010-10-21 22:36, Yonik Seeley wrote:
> 2010/10/21 kafka0102 <ka...@163.com>:
>> I found the cause of the problem: it's the DocSetCollector. My filter query matches about 3,000,000 documents, so DocSetCollector.getDocSet() returns an OpenBitSet, and 3,000,000 OpenBitSet.fastSet(doc) operations are too slow.
>
> As I said in my other response to you, that's a perfect reason why you
> want Solr to cache that for you (unless the filter will be different
> each time).
>
> -Yonik
> http://www.lucidimagination.com

Re: Why is Solr so much slower than Lucene?

Posted by Yonik Seeley <yo...@lucidimagination.com>.

2010/10/21 kafka0102 <ka...@163.com>:
> I found the cause of the problem: it's the DocSetCollector. My filter query matches about 3,000,000 documents, so DocSetCollector.getDocSet() returns an OpenBitSet, and 3,000,000 OpenBitSet.fastSet(doc) operations are too slow.


As I said in my other response to you, that's a perfect reason why you
want Solr to cache that for you (unless the filter will be different
each time).

-Yonik
http://www.lucidimagination.com
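
The caching advice above can be pictured with a small sketch. This is not Solr's filterCache implementation (that cache is configured in solrconfig.xml); it is a stdlib-only illustration in which the class FilterCacheSketch, its get method and the string keys are all hypothetical. The idea: the doc set for a filter such as fid:1 is computed once, and later requests with the same filter reuse it, so the per-document bit-setting cost is paid only on the first request.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class FilterCacheSketch {
    private final Map<String, BitSet> cache;
    int misses = 0; // number of times a doc set had to be computed

    FilterCacheSketch(final int maxEntries) {
        // accessOrder=true: the least recently used entry is the eldest and is evicted first
        this.cache = new LinkedHashMap<String, BitSet>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, BitSet> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached doc set for filterKey, computing and storing it on a miss. */
    BitSet get(String filterKey, Function<String, BitSet> compute) {
        BitSet docs = cache.get(filterKey);
        if (docs == null) {
            misses++;
            docs = compute.apply(filterKey);
            cache.put(filterKey, docs);
        }
        return docs;
    }

    public static void main(String[] args) {
        FilterCacheSketch cache = new FilterCacheSketch(16);
        // Stand-in for an expensive filter: sets ~3,000,000 bits one by one (sizes taken from the thread)
        Function<String, BitSet> expensive = key -> {
            BitSet docs = new BitSet(3_984_274);
            for (int doc = 0; doc < 3_000_000; doc++) {
                docs.set(doc);
            }
            return docs;
        };
        for (int i = 0; i < 3; i++) {
            long start = System.nanoTime();
            int hits = cache.get("fid:1", expensive).cardinality();
            System.out.printf("request %d: %d hits in %.2f ms%n", i, hits, (System.nanoTime() - start) / 1e6);
        }
    }
}
```

As the caveat in the message says, this only helps when the same filter recurs. A range such as atm:[int_time1 TO int_time2] whose bounds change on every request would miss the cache each time.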

Re: Why is Solr so much slower than Lucene?

Posted by kafka0102 <ka...@163.com>.

I found the cause of the problem: it's the DocSetCollector. My filter query matches about 3,000,000 documents, so DocSetCollector.getDocSet() returns an OpenBitSet, and 3,000,000 OpenBitSet.fastSet(doc) operations are too slow. So I used SolrIndexSearcher's TopFieldDocs search(Query query, Filter filter, int n, Sort sort) instead, and the timing is normal.
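
To make the difference concrete, here is a minimal sketch, not Solr's DocSetCollector or Lucene's OpenBitSet. It uses java.util.BitSet as a stand-in, and the class DocSetCostSketch, the method collect and the sort key are hypothetical. It contrasts keeping only the top n hits with keeping the top n hits and also recording every matching document in a bit set, which is the extra per-hit work described above.

```java
import java.util.BitSet;
import java.util.PriorityQueue;

class DocSetCostSketch {
    /**
     * Visits `matches` matching docs and keeps the n largest sort keys in a min-heap.
     * If docSetOrNull is non-null, every matching doc is also recorded in it.
     * Returns the total hit count. Requires n >= 1.
     */
    static int collect(int matches, int n, BitSet docSetOrNull) {
        PriorityQueue<Long> top = new PriorityQueue<>(n); // min-heap of the best n keys so far
        int hits = 0;
        for (int doc = 0; doc < matches; doc++) {
            hits++;
            long key = (doc * 2654435761L) % 1_000_003L; // made-up sort key, for illustration only
            if (top.size() < n) {
                top.add(key);
            } else if (key > top.peek()) {
                top.poll();
                top.add(key);
            }
            if (docSetOrNull != null) {
                docSetOrNull.set(doc); // the extra work: one bit per matching document
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        final int maxDoc = 3_984_274;  // record count from the original post
        final int matches = 3_000_000; // approximate filter hit count from this message
        for (int round = 0; round < 5; round++) { // repeat so the JIT can warm up
            long t0 = System.nanoTime();
            collect(matches, 20, null);
            long t1 = System.nanoTime();
            collect(matches, 20, new BitSet(maxDoc));
            long t2 = System.nanoTime();
            System.out.printf("top-N only: %.1f ms, top-N + doc set: %.1f ms%n",
                    (t1 - t0) / 1e6, (t2 - t1) / 1e6);
        }
    }
}
```

Timings from this sketch will vary by machine and will not reproduce the 20 ms versus 70 ms figures from the thread; it only shows where the additional per-document work comes from.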



