Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2021/10/25 07:59:00 UTC
[jira] [Created] (LUCENE-10203) Improve reuse of StringTokenStream
Adrien Grand created LUCENE-10203:
-------------------------------------
Summary: Improve reuse of StringTokenStream
Key: LUCENE-10203
URL: https://issues.apache.org/jira/browse/LUCENE-10203
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand
This issue is a follow-up to https://lists.apache.org/thread.html/rdcc6bd085a0e8ac6db22a1ef7dd3228197481b62bec3c6fe4972e50a%40%3Cdev.lucene.apache.org%3E.
StringField and TextField reuse token streams through different mechanisms: TextField relies on {{Analyzer#reuseStrategy}} to reuse token streams across inputs, whereas StringField relies on IndexingChain passing the previously produced token stream as the {{reuse}} parameter of {{IndexableField#tokenStream}}. One downside of the StringField approach is that it can only reuse token streams within a single segment. Some nightly profiles suggest that, without reuse across segments, attribute initialization can still show up as a hotspot.
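To make the difference concrete, here is a minimal plain-Java sketch that does not use Lucene at all. Every name in it ({{FakeTokenStream}}, {{SegmentWriter}}, {{CACHE}}) is hypothetical and exists only for illustration. It assumes that a fresh per-segment state object starts with no previous stream, and it uses a {{ThreadLocal}} as a stand-in for any cache that outlives a segment. It counts how often the expensive construction step runs under each scheme.

```java
public class ReuseSketch {
    /** Hypothetical stand-in for a token stream whose construction is costly. */
    static final class FakeTokenStream {
        static int created = 0;          // counts constructor calls
        String value;
        FakeTokenStream() { created++; } // models attribute initialization
        FakeTokenStream reset(String v) { value = v; return this; }
    }

    /** Models the "reuse parameter" style: the caller hands back the previous stream. */
    static FakeTokenStream tokenStream(String value, FakeTokenStream reuse) {
        FakeTokenStream ts = (reuse != null) ? reuse : new FakeTokenStream();
        return ts.reset(value);
    }

    /** Hypothetical per-segment state: the previous stream is lost with the segment. */
    static final class SegmentWriter {
        private FakeTokenStream previous;
        void index(String value) { previous = tokenStream(value, previous); }
    }

    /** Hypothetical cache that outlives segments (assumption: one stream per thread). */
    static final ThreadLocal<FakeTokenStream> CACHE =
        ThreadLocal.withInitial(FakeTokenStream::new);

    /** Number of streams constructed when reuse is scoped to a segment. */
    static int countPerSegmentReuse(int segments, int docsPerSegment) {
        FakeTokenStream.created = 0;
        for (int s = 0; s < segments; s++) {
            SegmentWriter writer = new SegmentWriter(); // new segment, no previous stream
            for (int d = 0; d < docsPerSegment; d++) {
                writer.index("id-" + s + "-" + d);
            }
        }
        return FakeTokenStream.created;
    }

    /** Number of streams constructed when the cache survives across segments. */
    static int countCrossSegmentReuse(int segments, int docsPerSegment) {
        FakeTokenStream.created = 0;
        CACHE.remove(); // start from an empty cache
        for (int s = 0; s < segments; s++) {
            for (int d = 0; d < docsPerSegment; d++) {
                CACHE.get().reset("id-" + s + "-" + d);
            }
        }
        return FakeTokenStream.created;
    }

    public static void main(String[] args) {
        System.out.println("per-segment reuse:   " + countPerSegmentReuse(3, 100));
        System.out.println("cross-segment reuse: " + countCrossSegmentReuse(3, 100));
    }
}
```

In this toy model the per-segment scheme constructs one stream per segment, while the longer-lived cache constructs one per thread. Whether that difference matters in practice depends on how often segments are flushed, which is what the nightly profiles hint at.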