Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2013/03/15 19:28:14 UTC

[jira] [Updated] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time

     [ https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-4589:
---------------------------

    Attachment: test-just-queries.sh
                test-just-queries.out__4.0.0_mmap_lazy_using36index.txt
                test.sh
                test.out__4.2.0_nio_nolazy.txt
                test.out__4.2.0_nio_lazy.txt
                test.out__4.2.0_mmap_nolazy.txt
                test.out__4.2.0_mmap_lazy.txt
                test.out__4.0.0_nio_nolazy.txt
                test.out__4.0.0_nio_lazy.txt
                test.out__4.0.0_mmap_nolazy.txt
                test.out__4.0.0_mmap_lazy.txt
                test.out__3.6.1_nio_nolazy.txt
                test.out__3.6.1_nio_lazy.txt
                test.out__3.6.1_mmap_nolazy.txt
                test.out__3.6.1_mmap_lazy.txt


The attached files include a test.sh script that:
* creates some data where fields have a large number of values
* loads the data into solr
* execs 2 queries for a single doc using two different fl options
* triggers a commit to flush caches
* execs the same two queries in a different order
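
For readers without the attachments, the steps above can be sketched roughly as follows. This is not the attached test.sh: the Solr URL (the example jetty's default port), the document id `big_doc`, the multivalued field name `vals_ss`, the value count, and the two fl lists are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only. SOLR_URL, the id "big_doc", the field "vals_ss" and the fl
# lists are assumptions, not values taken from the attached test.sh.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Emit one Solr XML <add> message whose multivalued field holds $1 values.
gen_doc() {
  n="$1"
  printf '<add><doc><field name="id">big_doc</field>\n'
  i=1
  while [ "$i" -le "$n" ]; do
    printf '<field name="vals_ss">value_%d</field>\n' "$i"
    i=$((i + 1))
  done
  printf '</doc></add>\n'
}

# Load the generated document into Solr and commit it.
load_doc() {
  gen_doc "$1" | curl -s "$SOLR_URL/update?commit=true" \
    -H 'Content-Type: text/xml' --data-binary @-
}

# Time a single-document query that uses the given fl list.
timed_query() {
  time curl -s -o /dev/null "$SOLR_URL/select?q=id:big_doc&fl=$1"
}

# Trigger a commit (the step used above to flush caches).
trigger_commit() {
  curl -s -o /dev/null "$SOLR_URL/update" \
    -H 'Content-Type: text/xml' --data-binary '<commit/>'
}

# The full sequence: load, two queries, commit, same queries reversed.
run_all() {
  load_doc 15000
  timed_query 'id'          # small fl (assumed)
  timed_query 'id,vals_ss'  # big fl (assumed)
  trigger_commit
  timed_query 'id,vals_ss'  # big fl first this time
  timed_query 'id'          # small fl
}
```

Nothing runs until `run_all` is called against a live Solr instance; `gen_doc 3` on its own prints a three-value document and can be used to inspect the generated XML.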

Also attached are the raw results of running this script on my Thinkpad T430s against the example jetty & solr configs, where the version of solr, lazy field loading, and the directory impl were varied...

* version of solr
** 3.6.1
** 4.0.0
** 4.2.0
* lazy field loading:
** lazy: default example configs
** nolazy: perl -i -pe 's{<enableLazyFieldLoading>true}{<enableLazyFieldLoading>false}' solrconfig.xml
* directory impl:
** mmap: java -Dsolr.directoryFactory=solr.MMapDirectoryFactory -jar start.jar
** nio: java -Dsolr.directoryFactory=solr.NIOFSDirectoryFactory -jar start.jar
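
The combinations listed above line up with the attached result files. As a small sketch (the function names `list_runs` and `factory_for` are made up for illustration and do not come from the attachments), the matrix and the directory factory for each label could be enumerated like this:

```shell
# Print one result-file name per version x directory impl x lazy setting,
# following the naming pattern of the attached test.out__* files.
list_runs() {
  for ver in 3.6.1 4.0.0 4.2.0; do
    for dir in mmap nio; do
      for lazy in lazy nolazy; do
        printf 'test.out__%s_%s_%s.txt\n' "$ver" "$dir" "$lazy"
      done
    done
  done
}

# Map the short directory label to the value passed via
# -Dsolr.directoryFactory when starting jetty.
factory_for() {
  case "$1" in
    mmap) echo solr.MMapDirectoryFactory ;;
    nio)  echo solr.NIOFSDirectoryFactory ;;
    *)    echo "unknown directory impl: $1" >&2; return 1 ;;
  esac
}
```

`list_runs` prints twelve names, one for each of the twelve test.out__* attachments.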

There was no apparent difference between the directory impls chosen, or between 4.0 and 4.2.  Here are the summary results for 3.6 vs 4.0 using mmap...

|| step || 3.6 nolazy || 3.6 lazy || 4.0 nolazy || 4.0 lazy ||
| small fl | 0m0.308s | 0m0.998s | 0m0.260s | 0m0.202s | 
| big fl | 0m0.178s | 0m0.263s | 0m0.084s | *16m15.735s* | 
| commit | XXXXXXX | XXXXXXX | XXXXXXX | XXXXXXX |
| big fl | 0m0.157s | 0m0.118s | 0m0.218s | 0m0.133s |
| small fl | 0m0.036s | 0m0.035s | 0m0.049s | *3m2.814s* |
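
To put the highlighted cells in perspective, the `time`-style values in the table can be converted to seconds and compared. The helpers below are only an illustrative sketch; `to_seconds` and `slowdown` are hypothetical names, not part of the attached scripts.

```shell
# Convert a `time`-style value such as 16m15.735s into seconds.
to_seconds() {
  printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of the first value to the second, rounded to a whole number.
slowdown() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.0f\n", a / b }'
}
```

For example, `slowdown 16m15.735s 0m0.263s` compares the first "big fl" request under 4.0 lazy with the same request under 3.6 lazy and comes out at roughly 3,700x; `slowdown 3m2.814s 0m0.035s` does the same for the final "small fl" request and gives roughly 5,200x.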

Also attached are the results of a single test I did running Solr 4.0 pointed at the configs & index built with 3.6.1 to rule out codec changes: it behaved essentially the same as the 4.0 tests that built the index from scratch.

                
> 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4589
>                 URL: https://issues.apache.org/jira/browse/SOLR-4589
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0, 4.1, 4.2
>            Reporter: Hoss Man
>         Attachments: test-just-queries.out__4.0.0_mmap_lazy_using36index.txt, test-just-queries.sh, test.out__3.6.1_mmap_lazy.txt, test.out__3.6.1_mmap_nolazy.txt, test.out__3.6.1_nio_lazy.txt, test.out__3.6.1_nio_nolazy.txt, test.out__4.0.0_mmap_lazy.txt, test.out__4.0.0_mmap_nolazy.txt, test.out__4.0.0_nio_lazy.txt, test.out__4.0.0_nio_nolazy.txt, test.out__4.2.0_mmap_lazy.txt, test.out__4.2.0_mmap_nolazy.txt, test.out__4.2.0_nio_lazy.txt, test.out__4.2.0_nio_nolazy.txt, test.sh
>
>
> Following up on a [user report of extreme CPU usage in 4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3C1362019882934-4043543.post@n3.nabble.com%3E], I've discovered that the following combination of factors can result in extreme CPU usage and excessive HTTP response times...
> * Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
> * enableLazyFieldLoading == true (included in example solrconfig.xml)
> * documents with a large number of values in multivalued fields (eg: tested ~10-15K values)
> * multiple requests returning the same doc with different "fl" lists
> I haven't dug into the root cause yet, but the essential observation is: if lazy loading is used in 4.x, then once a document has been fetched with an initial fl list X, subsequent requests for that document using a different fl list Y can be many orders of magnitude slower (while pegging the CPU) -- even if those same requests using fl Y uncached (or w/o lazy loading) would be extremely fast.
