Posted to jira@jena.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/04/09 13:36:00 UTC

[jira] [Commented] (JENA-2319) Concurrency errors in text search when using explicit Analyzers

    [ https://issues.apache.org/jira/browse/JENA-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519978#comment-17519978 ] 

ASF subversion and git services commented on JENA-2319:
-------------------------------------------------------

Commit 019db539c17b61705af4f9399ef40e9a0ee38ec6 in jena's branch refs/heads/main from der
[ https://gitbox.apache.org/repos/asf?p=jena.git;h=019db539c1 ]

JENA-2319: Fix concurrency issue with ConfigurableAnalyzer

Text query parsing when using a ConfigurableAnalyzer is not
thread safe. The underlying issue may lie in the Lucene analyzers,
since ConfigurableAnalyzer itself looks clean.

The proposed fix is just a brute-force synchronization, since without
a clearer characterisation it is hard to detect all possibly unsafe
analyzer configurations.
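
For illustration only, here is a minimal sketch of what brute-force
synchronization of a parse step can look like. It is not the code from
the commit and it does not use Lucene or Jena: SharedStateAnalyzer,
SynchronizedParseSketch, parseLock and countMismatches are hypothetical
names, and the stand-in analyzer simply reuses one internal buffer to
mimic a component that is unsafe when shared between threads. The
parseQuery name echoes TextIndexLucene.parseQuery from the stack traces,
but the body here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for an analyzer that reuses internal state and is not thread safe. */
final class SharedStateAnalyzer {
    // Reused between calls; concurrent callers would corrupt each other's data.
    private final StringBuilder buffer = new StringBuilder();

    String normalize(String text) {
        buffer.setLength(0);
        for (int i = 0; i < text.length(); i++) {
            buffer.append(Character.toLowerCase(text.charAt(i)));
        }
        return buffer.toString();
    }
}

/** Hypothetical index wrapper: every parse goes through one lock (the brute-force approach). */
final class SynchronizedParseSketch {
    private final SharedStateAnalyzer analyzer = new SharedStateAnalyzer();
    private final Object parseLock = new Object();

    String parseQuery(String queryString) {
        // Only one thread at a time may use the shared analyzer.
        synchronized (parseLock) {
            return analyzer.normalize(queryString);
        }
    }

    /** Runs parseQuery from several threads and counts results that differ from the expected value. */
    static int countMismatches(int threads, int iterations) {
        SynchronizedParseSketch index = new SynchronizedParseSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final String input = "Query-" + t + "-ABCDEFGHIJKLMNOP";
                final String expected = input.toLowerCase(Locale.ROOT);
                results.add(pool.submit(() -> {
                    int bad = 0;
                    for (int i = 0; i < iterations; i++) {
                        if (!expected.equals(index.parseQuery(input))) {
                            bad++;
                        }
                    }
                    return bad;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("concurrent parse run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(8, 10_000));
    }
}
```

The trade-off of this approach is that all parsing is serialized behind
one lock, which costs some throughput but does not require knowing which
analyzer configurations are unsafe.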


> Concurrency errors in text search when using explicit Analyzers
> ---------------------------------------------------------------
>
>                 Key: JENA-2319
>                 URL: https://issues.apache.org/jira/browse/JENA-2319
>             Project: Apache Jena
>          Issue Type: Bug
>            Reporter: Dave Reynolds
>            Priority: Major
>
> Seeing errors when multiple Jena text queries are in flight at the same time. The precise traces vary, but all examples seen so far occur in the Lucene analyzer phase of query parsing. I have only been able to reproduce this reliably when using the ConfigurableAnalyzer, but that code itself looks clean, suggesting that Lucene Analyzers in general may not be thread safe.
> Reproduced on Jena versions from 3.16.0 through 4.4.0.
> Will submit a PR with a test case and a brute-force fix (synchronizing the query parse step), though more subtle fixes may be possible.
> Example partial stack traces:
> {{Caused by: java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.}}
> {{    at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:109) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.CharacterUtils.readFully(CharacterUtils.java:184) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.CharacterUtils.fill(CharacterUtils.java:160) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.CharacterUtils.fill(CharacterUtils.java:178) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.util.CharTokenizer.incrementToken(CharTokenizer.java:174) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter.incrementToken(ASCIIFoldingFilter.java:102) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:41) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.CachingTokenFilter.fillCache(CachingTokenFilter.java:91) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.CachingTokenFilter.incrementToken(CachingTokenFilter.java:70) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:312) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:260) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParserBase.newFieldQuery(QueryParserBase.java:473) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParserBase.getFieldQuery(QueryParserBase.java:465) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:828) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:469) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:355) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:244) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:215) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:109) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.jena.query.text.TextIndexLucene.parseQuery(TextIndexLucene.java:441) ~[fuseki-server.jar:4.4.0]}}
> {{...}}
> {{and}}
> {{Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 16 out of bounds for length 16}}
> {{    at org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter.incrementToken(ASCIIFoldingFilter.java:109) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.LowerCaseFilter.incrementToken(LowerCaseFilter.java:41) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.analysis.Analyzer.normalize(Analyzer.java:247) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParserBase.getRegexpQuery(QueryParserBase.java:756) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParserBase.handleBareTokenQuery(QueryParserBase.java:824) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:469) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:355) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:244) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:215) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:109) ~[fuseki-server.jar:4.4.0]}}
> {{    at org.apache.jena.query.text.TextIndexLucene.parseQuery(TextIndexLucene.java:441) ~[fuseki-server.jar:4.4.0]}}
> {{...}}


