Posted to solr-user@lucene.apache.org by Samarendra Pratap <sa...@gmail.com> on 2011/08/17 11:05:02 UTC

Solr 1.4.1 vs 3.3 (Speed)

Hi, we are planning to migrate from Solr 1.4.1 to Solr 3.3, and I am doing a
manual performance comparison.

We have set up two separate Solr installations (1.4.1 and 3.3) on different
ports.
 1. Both have the same index (old Lucene format index) of around 20 GB with
10 million documents and 60 fields (40 fields with indexed="true").
 2. Both processes have a maximum of 4 GB memory allocated (-Xms2048m -Xmx4096m).
 3. Both installations are on the same server (8 processor Intel(R) Core(TM)
i7 CPU 930 @ 2.80GHz, 8 GB RAM, 64-bit Linux system).
 4. We are running Solr 1.4.1 with the collapsing patch
(SOLR-236-1_4_1.patch, https://issues.apache.org/jira/browse/SOLR-236).

 When I send exactly the same query to both servers, one after the other,
Solr 1.4.1 responds faster than Solr 3.3.
 Before I convert the index to the LUCENE_33 format, I thought it would be
good to get expert advice.

 Is there something I should look into more deeply? Or could this be an
effect of using the old index format with the new version, which can be
ignored?

 When I use "debugQuery=true", it clearly shows
that org.apache.solr.handler.component.CollapseComponent (Solr 1.4.1)
takes noticeably less time
than org.apache.solr.handler.component.QueryComponent (Solr 3.3).
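
To compare the per-component numbers side by side instead of reading them by
eye, something like the following could be used. It is only a sketch: it
assumes the response was requested as JSON and that the debug output contains
a timing section whose phases ("prepare", "process") are keyed by component
class name, each with a "time" entry. That shape should be checked against
the real response, and the numbers in the sample are made up.

```python
from typing import Dict


def component_times(phase: Dict) -> Dict[str, float]:
    """Return {component class name: time} for one timing phase.

    `phase` is assumed to look like the "prepare" or "process" part of the
    debug timing output: a total under "time" plus one nested entry per
    search component.
    """
    return {
        name: float(entry["time"])
        for name, entry in phase.items()
        if isinstance(entry, dict) and "time" in entry
    }


def slowest_component(phase: Dict) -> str:
    """Name of the component with the largest reported time."""
    times = component_times(phase)
    return max(times, key=times.get)


# Hypothetical "process" phase; the values are illustrative, not measured.
SAMPLE_PROCESS = {
    "time": 12.0,
    "org.apache.solr.handler.component.QueryComponent": {"time": 10.0},
    "org.apache.solr.handler.component.FacetComponent": {"time": 0.0},
    "org.apache.solr.handler.component.DebugComponent": {"time": 2.0},
}

if __name__ == "__main__":
    for name, t in sorted(component_times(SAMPLE_PROCESS).items()):
        print(f"{t:8.1f}  {name}")
    print("slowest:", slowest_component(SAMPLE_PROCESS))
```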

 I am testing this against simple queries without any faceting,
highlighting, collapsing, etc.:

http://xxx.xxx:8983/solr/select/?q=Packaging%20Material,%20Supplies&qt=dismax&qf=category^4.0&qf=keywords^2.0&qf=title^2.0&qf=smalldesc&qf=companyname&qf=usercategory&qf=usrpcatdesc&qf=city&qs=10&pf=category^4.0&pf=keywords^3&pf=title^3&pf=smalldesc^1.5&pf=companyname&pf=usercategory&pf=usrpcatdesc&pf=city&ps=0&bq=type:[149%20TO%201500]^3&start=0&rows=50&fl=title,smalldesc,id&debugQuery=true
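
Because the query repeats qf and pf many times, it is easy to send slightly
different URLs to the two servers by accident. A minimal sketch of building
the same request from one parameter list is below. The parameter values are
copied from the URL above; the function name is only illustrative. Note that
urlencode percent-encodes characters such as "^", "[" and "," and writes
spaces as "+", which should decode to the same values on the server.

```python
from typing import List, Tuple
from urllib.parse import urlencode

BASE = "http://xxx.xxx:8983/solr/select/"  # host elided, as in the message

# A list of pairs (not a dict) so repeated keys like qf and pf are kept.
PARAMS: List[Tuple[str, str]] = [
    ("q", "Packaging Material, Supplies"),
    ("qt", "dismax"),
    ("qf", "category^4.0"), ("qf", "keywords^2.0"), ("qf", "title^2.0"),
    ("qf", "smalldesc"), ("qf", "companyname"), ("qf", "usercategory"),
    ("qf", "usrpcatdesc"), ("qf", "city"),
    ("qs", "10"),
    ("pf", "category^4.0"), ("pf", "keywords^3"), ("pf", "title^3"),
    ("pf", "smalldesc^1.5"), ("pf", "companyname"), ("pf", "usercategory"),
    ("pf", "usrpcatdesc"), ("pf", "city"),
    ("ps", "0"),
    ("bq", "type:[149 TO 1500]^3"),
    ("start", "0"),
    ("rows", "50"),
    ("fl", "title,smalldesc,id"),
    ("debugQuery", "true"),
]


def build_url(base: str, params: List[Tuple[str, str]]) -> str:
    """Join the base URL and the encoded query string."""
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_url(BASE, PARAMS))
```

Pointing BASE at each installation in turn keeps everything after the "?"
identical between the two runs.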

 Any insights from the experts would be greatly appreciated!

 Thanks in advance.

-- 
Regards,
Samar

RE: Solr 1.4.1 vs 3.3 (Speed)

Posted by "Jaeger, Jay - DOT" <Ja...@dot.wi.gov>.
It would perhaps help if you reported what you mean by "noticeably less time".  What were  your timings?  Did you run the tests multiple times?

One thing to watch for in testing:  Solr performance is greatly affected by the OS file system cache.  So make sure when testing that you use the same searches on both versions, and that you either run your tests enough times for the OS file system cache to be populated, or deliberately start cold each time, and do the same for both.

So if, for example, you ran your Solr 1.4 test against your production server (which would have the file system cache populated), but ran your Solr 3.3 test from a "cold" start, you would indeed get very different timings.
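
A minimal sketch of that kind of repeatable measurement is below. It is not a tool from either Solr release; the function names, warm-up count and repeat count are all assumptions chosen for illustration. The idea is to issue the same request a few times untimed (so caches are populated), then time several more runs and look at the spread instead of a single number.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def time_query(run_query: Callable[[], object],
               warmup: int = 3, repeats: int = 10) -> List[float]:
    """Call run_query `warmup` times untimed, then time `repeats` calls."""
    for _ in range(warmup):
        run_query()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Min, median and max of the timings, in seconds."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def http_query(url: str) -> Callable[[], bytes]:
    """Wrap a URL as a zero-argument callable that fetches it once."""
    def run() -> bytes:
        with urllib.request.urlopen(url) as response:
            return response.read()
    return run


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a server; in a real test
    # pass http_query(<url of each installation>) instead.
    print(summarize(time_query(lambda: sum(range(10000)))))
```

Running it once per installation with the identical URL, and comparing medians, gives numbers that are easier to report than "noticeably less time".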

That said, in my testing I have noticed that Solr 3.3 seems to be noticeably slower than Solr 3.2, which seems to be about the same as Solr 3.1, which was a little slower than Solr 1.4.

So, I offer the test results below -- with the caveat that I didn't always record all the parameters of the test, and didn't always worry about having only one thing changing between tests -- my goal at the time was to confirm that performance was adequate for our particular need.  Also, in the WebSphere 7 tests, I probably also had MyEclipse running (which I used to build an EAR to feed to WAS 7), so there was less file system cache available to Solr.

(In the table below, each thread runs 100 queries.  One query term (using a last and first name) was set to the specified fuzz)

Fuzz    Threads   Time/request   Rate/Hour   Release   Container     Notes
0       4         0.16           90,000      1.4       Jetty
0       4         0.38           37,894      3.1       Jetty

???     4         0.67           21,492      3.1       WebSphere 7   Didn't record the fuzz factor, but think it was 0.50
???     4         0.66           21,818      3.2       WebSphere 7   But used the same one here.
???     4         1.68           14,693      3.3       WebSphere 7   And, I believe, the same one here

0.80    4         0.28           56,470      1.4       Jetty
0.80    4         0.34           42,354      3.1       Jetty



Re: Solr 1.4.1 vs 3.3 (Speed)

Posted by Alexei Martchenko <al...@superdownloads.com.br>.
I'm doing the exact same migration. Here is what I've accomplished so far:

   1. In solrconfig.xml I put
   <luceneMatchVersion>LUCENE_33</luceneMatchVersion> as the first line
   inside the <config> element. Warnings go crazy if you don't do that.
   2. The highlighter shows a deprecation warning; I'm still working on
   that. It works, but I'd like to use the new FastVectorHighlighter, which
   I'm struggling with right now.
   3. All my speed measurements come out about the same. Sometimes we lose
   60ms, sometimes we gain 60ms, so on average it's even. I'll rebuild the
   index from scratch to see whether that makes a difference, maybe today or
   later this week.
   4. Since I had to set termVectors="true" termPositions="true"
   termOffsets="true" on 3 fields to use FastVectorHighlighter, I expect
   speed gains in highlighting.




-- 

*Alexei Martchenko* | *CEO* | Superdownloads
alexei@superdownloads.com.br | alexei@martchenko.com.br | (11)
5083.1018/5080.3535/5080.3533