Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/03/10 19:22:55 UTC

[GitHub] [lucene-solr] gerlowskija commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off

gerlowskija commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off
URL: https://github.com/apache/lucene-solr/pull/1332#discussion_r390556194
 
 

 ##########
 File path: solr/solr-ref-guide/src/the-tagger-handler.adoc
 ##########
 @@ -271,11 +271,12 @@ The response should be this (the QTime may vary):
   }}
 ----
 
-== Tagger Tips
+== Tagger Performance Tips
 
-Performance Tips:
-
-* Follow the recommended configuration field settings, especially `postingsFormat=FST50`.
+* Follow the recommended configuration field settings above.
+Additionally, for the best tagger performance, set `postingsFormat=FST50`.
+However, non-default postings formats have no backwards-compatibility guarantees, and so if you upgrade Solr then you may find a nasty exception on startup as it fails to read the older index.
+If the input text to be tagged is small (e.g. you are tagging queries or tweets) then the postings format choice isn't as important.
 
 Review comment:
   [Q] Interesting. I didn't realize that FST50's performance advantage over the default shrinks as the individual document size gets smaller. Did you run a particular performance test to bear this out, or are you intuiting that behavior from knowing how postingsFormats work?
   
   Is the performance still comparable when numTweets (or whatever the count of tiny docs is) gets large and the postings lists grow from the sheer number of them?
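   
   For reference, the setting under discussion is an attribute on the tag field type in the schema. A rough sketch of what that looks like, assuming a field type named `tag` and a deliberately simplified analyzer chain (both are illustrative here, not the exact recommended configuration from the guide):
   
   ```xml
   <!-- Sketch only: the field type name and analyzer chain are assumptions for illustration. -->
   <!-- postingsFormat="FST50" is the non-default postings format the doc change warns about; -->
   <!-- dropping the attribute falls back to the default format. -->
   <fieldType name="tag" class="solr.TextField" positionIncrementGap="100"
              postingsFormat="FST50" omitTermFreqAndPositions="true" omitNorms="true">
     <analyzer>
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
     </analyzer>
   </fieldType>
   ```
   
   Since the new text says non-default postings formats carry no backwards-compatibility guarantee, an index built with this attribute would presumably need to be rebuilt after an upgrade, which is the trade-off the performance question above bears on.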
