
Posted to dev@lucene.apache.org by "Paul Elschot (JIRA)" <ji...@apache.org> on 2014/08/02 18:30:12 UTC

[jira] [Commented] (LUCENE-5861) CachingTokenFilter should use ArrayList not LinkedList

    [ https://issues.apache.org/jira/browse/LUCENE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083617#comment-14083617 ] 

Paul Elschot commented on LUCENE-5861:
--------------------------------------

TeeSinkTokenFilter.SinkTokenStream in the analysis common module (o.a.l.analysis.sinks) uses a LinkedList, too.

I also prefer an ArrayList, but I used a LinkedList in the PrefillTokenStream of LUCENE-5687 as well, because the existing code uses one and I don't know of any existing performance tests for this.

To grow an ArrayList, would it be good to use ArrayUtil.oversize()?
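
For illustration, here is a minimal sketch of what growing a plain array with an oversize-style policy might look like. java.util.ArrayList manages its own capacity internally, so a helper like ArrayUtil.oversize() would only come into play if the cache were backed by a raw array. The class StateCacheSketch and its oversize(int) method are hypothetical stand-ins written for this example, not Lucene code, and the roughly one-eighth headroom is an assumption about the growth policy.

```java
import java.util.Arrays;

/** Hypothetical growable cache backed by a plain array; not Lucene code. */
public class StateCacheSketch {
    private Object[] states = new Object[4];
    private int size = 0;

    /**
     * Assumed growth policy: the requested minimum plus about 1/8 headroom,
     * with at least 3 spare slots. Integer overflow is ignored in this sketch.
     */
    public static int oversize(int minSize) {
        int extra = Math.max(minSize >> 3, 3);
        return minSize + extra;
    }

    /** Appends one cached state, reallocating the backing array when it is full. */
    public void add(Object state) {
        if (size == states.length) {
            states = Arrays.copyOf(states, oversize(size + 1));
        }
        states[size++] = state;
    }

    /** Returns the cached state at the given position. */
    public Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return states[index];
    }

    public int size() {
        return size;
    }

    public int capacity() {
        return states.length;
    }

    public static void main(String[] args) {
        StateCacheSketch cache = new StateCacheSketch();
        for (int i = 0; i < 1000; i++) {
            cache.add("state-" + i);
        }
        System.out.println(cache.size() + " entries, capacity " + cache.capacity());
    }
}
```

Compared with a LinkedList, this keeps one reference per cached entry in a contiguous array instead of allocating a node per entry. Whether that matters in practice would still need a performance test.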


> CachingTokenFilter should use ArrayList not LinkedList
> ------------------------------------------------------
>
>                 Key: LUCENE-5861
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5861
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Minor
>
> CachingTokenFilter, to my surprise, puts each new AttributeSource.State onto a LinkedList.  I think it should be an ArrayList.  On large fields that get analyzed, there can be a ton of State objects to cache.
> I also observe that State is itself a linked list of other State objects.  Perhaps we could take this one step further and do parallel arrays of AttributeImpl, thereby bypassing State.



