Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2013/03/15 19:28:14 UTC
[jira] [Updated] (SOLR-4589) 4.x + enableLazyFieldLoading + large
multivalued fields + varying fl = pathological CPU load & response time
[ https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated SOLR-4589:
---------------------------
Attachment: test-just-queries.sh
test-just-queries.out__4.0.0_mmap_lazy_using36index.txt
test.sh
test.out__4.2.0_nio_nolazy.txt
test.out__4.2.0_nio_lazy.txt
test.out__4.2.0_mmap_nolazy.txt
test.out__4.2.0_mmap_lazy.txt
test.out__4.0.0_nio_nolazy.txt
test.out__4.0.0_nio_lazy.txt
test.out__4.0.0_mmap_nolazy.txt
test.out__4.0.0_mmap_lazy.txt
test.out__3.6.1_nio_nolazy.txt
test.out__3.6.1_nio_lazy.txt
test.out__3.6.1_mmap_nolazy.txt
test.out__3.6.1_mmap_lazy.txt
The attached files include a test.sh script that:
* creates some data where fields have a large number of values
* loads the data into solr
* execs 2 queries for a single doc using two different fl options
* triggers a commit to flush caches
* execs the same two queries in a different order
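
The steps above can be sketched roughly as follows. This is not the attached test.sh, only an illustration of its shape: the field name `big_ss`, the choice of `id` as the "small" fl, the base URL, and all helper names are assumptions made for the example. It builds a Solr XML add message for one document with many values in a single field, then lists the request sequence (two queries, a commit, the same two queries reversed) without sending anything.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Hypothetical names: the real test.sh may use different fields and fl lists.
BIG_FIELD = "big_ss"          # assumed multivalued field
SMALL_FL = "id"               # assumed "small" field list
BIG_FL = "id," + BIG_FIELD    # assumed "big" field list


def build_add_xml(doc_id, num_values):
    """Build a Solr XML <add> message for one doc with many values in one field."""
    fields = ['<field name="id">%s</field>' % escape(doc_id)]
    fields += [
        '<field name="%s">value_%d</field>' % (BIG_FIELD, i)
        for i in range(num_values)
    ]
    return "<add><doc>" + "".join(fields) + "</doc></add>"


def query_url(base, doc_id, fl):
    """URL of a select request for a single doc, restricted to the given fl."""
    return base + "/select?" + urlencode({"q": "id:" + doc_id, "fl": fl})


def request_plan(base, doc_id):
    """The request sequence described above, as (label, url) pairs."""
    small = query_url(base, doc_id, SMALL_FL)
    big = query_url(base, doc_id, BIG_FL)
    commit = base + "/update?commit=true"
    return [
        ("small fl", small),
        ("big fl", big),
        ("commit", commit),
        ("big fl", big),
        ("small fl", small),
    ]


if __name__ == "__main__":
    xml = build_add_xml("doc1", 10000)
    print("add message is %d bytes" % len(xml))
    for label, url in request_plan("http://localhost:8983/solr", "doc1"):
        print(label, url)
```

Timing each request in the plan (for example with `time curl ...`) would give one row per step, in the same order as the rows of the summary table.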
Also attached are the raw results of running this script on my Thinkpad T430s against the example jetty & solr configs, where the version of solr, lazy field loading, and the directory impl were varied...
* version of solr
** 3.6.1
** 4.0.0
** 4.2.0
* lazy field loading:
** lazy: default example configs
** nolazy: perl -i -pe 's{<enableLazyFieldLoading>true}{<enableLazyFieldLoading>false}' solrconfig.xml
* directory impl:
** mmap: java -Dsolr.directoryFactory=solr.MMapDirectoryFactory -jar start.jar
** nio: java -Dsolr.directoryFactory=solr.NIOFSDirectoryFactory -jar start.jar
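
As a quick cross-check, the three dimensions above can be enumerated to see which result files a full run should produce. The naming pattern `test.out__<version>_<directory>_<lazy>.txt` is inferred from the attachment names; the function names are made up for this sketch.

```python
from itertools import product

VERSIONS = ["3.6.1", "4.0.0", "4.2.0"]
DIRECTORIES = ["mmap", "nio"]
LAZY = ["lazy", "nolazy"]


def output_name(version, directory, lazy):
    """Result file name, following the pattern seen in the attachment list."""
    return "test.out__%s_%s_%s.txt" % (version, directory, lazy)


def all_runs():
    """One result file name per combination of version, directory impl, and lazy setting."""
    return [output_name(v, d, l) for v, d, l in product(VERSIONS, DIRECTORIES, LAZY)]


if __name__ == "__main__":
    for name in all_runs():
        print(name)
```

This yields 3 x 2 x 2 = 12 combinations, one for each of the twelve test.out__* attachments.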
The choice of directory impl made no apparent difference, nor did 4.0 vs 4.2. Here are the summary results for 3.6 vs 4.0 using mmap...
|| step || 3.6 nolazy || 3.6 lazy || 4.0 nolazy || 4.0 lazy ||
| small fl | 0m0.308s | 0m0.998s | 0m0.260s | 0m0.202s |
| big fl | 0m0.178s | 0m0.263s | 0m0.084s | *16m15.735s* |
| commit | XXXXXXX | XXXXXXX | XXXXXXX | XXXXXXX |
| big fl | 0m0.157s | 0m0.118s | 0m0.218s | 0m0.133s |
| small fl | 0m0.036s | 0m0.035s | 0m0.049s | *3m2.814s* |
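
To put the two highlighted cells in perspective, the timings can be converted to seconds and compared against the 4.0 nolazy column. A small sketch, with values copied from the table and a helper name (`to_seconds`) invented for the example:

```python
import re

# Matches the "<minutes>m<seconds>s" format used in the table.
_TIME_RE = re.compile(r"^(\d+)m(\d+(?:\.\d+)?)s$")


def to_seconds(text):
    """Convert a timing such as '16m15.735s' (optionally wrapped in *...*) to seconds."""
    match = _TIME_RE.match(text.strip().strip("*"))
    if match is None:
        raise ValueError("unrecognized timing: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


# "big fl" before the commit, 4.0 nolazy vs 4.0 lazy.
big_fl_nolazy = to_seconds("0m0.084s")
big_fl_lazy = to_seconds("*16m15.735s*")

# "small fl" after the commit, 4.0 nolazy vs 4.0 lazy.
small_fl_nolazy = to_seconds("0m0.049s")
small_fl_lazy = to_seconds("*3m2.814s*")

if __name__ == "__main__":
    print("big fl slowdown with lazy loading:   %.0fx" % (big_fl_lazy / big_fl_nolazy))
    print("small fl slowdown with lazy loading: %.0fx" % (small_fl_lazy / small_fl_nolazy))
```

Both ratios are in the thousands; these are single runs on one machine, so they indicate scale rather than a precise measurement.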
Also attached are the results of a single test I did running Solr 4.0 pointed at the configs & index built with 3.6.1 to rule out codec changes: it behaved essentially the same as the 4.0 tests that built the index from scratch.
> 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time
> ------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-4589
> URL: https://issues.apache.org/jira/browse/SOLR-4589
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.0, 4.1, 4.2
> Reporter: Hoss Man
> Attachments: test-just-queries.out__4.0.0_mmap_lazy_using36index.txt, test-just-queries.sh, test.out__3.6.1_mmap_lazy.txt, test.out__3.6.1_mmap_nolazy.txt, test.out__3.6.1_nio_lazy.txt, test.out__3.6.1_nio_nolazy.txt, test.out__4.0.0_mmap_lazy.txt, test.out__4.0.0_mmap_nolazy.txt, test.out__4.0.0_nio_lazy.txt, test.out__4.0.0_nio_nolazy.txt, test.out__4.2.0_mmap_lazy.txt, test.out__4.2.0_mmap_nolazy.txt, test.out__4.2.0_nio_lazy.txt, test.out__4.2.0_nio_nolazy.txt, test.sh
>
>
> Following up on a [user report of extreme CPU usage in 4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3C1362019882934-4043543.post@n3.nabble.com%3E], I've discovered that the following combination of factors can result in extreme CPU usage and excessive HTTP response times...
> * Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
> * enableLazyFieldLoading == true (included in example solrconfig.xml)
> * documents with a large number of values in multivalued fields (eg: tested ~10-15K values)
> * multiple requests returning the same doc with different "fl" lists
> I haven't dug into the root cause yet, but the essential observation is: if lazy loading is used in 4.x, then once a document has been fetched with an initial fl list X, subsequent requests for that document using a different fl list Y can be many orders of magnitude slower (while pegging the CPU) -- even if those same requests using fl Y uncached (or w/o lazy loading) would be extremely fast.