You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Karl Wettin (JIRA)" <ji...@apache.org> on 2009/01/09 16:30:59 UTC
[jira] Updated: (LUCENE-1514) ShingleMatrixFilter eaily throws
StackOverFlow as the complexity of a matrix grows
[ https://issues.apache.org/jira/browse/LUCENE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wettin updated LUCENE-1514:
--------------------------------
Attachment: LUCENE-1514.txt
> ShingleMatrixFilter eaily throws StackOverFlow as the complexity of a matrix grows
> ----------------------------------------------------------------------------------
>
> Key: LUCENE-1514
> URL: https://issues.apache.org/jira/browse/LUCENE-1514
> Project: Lucene - Java
> Issue Type: Improvement
> Components: contrib/analyzers
> Affects Versions: 2.4
> Reporter: Karl Wettin
> Assignee: Karl Wettin
> Fix For: 2.9
>
> Attachments: LUCENE-1514.txt
>
>
> ShingleMatrixFilter#next makes a recursive function invocation when the current permutation iterator is exhausted or if the current state of the permutation iterator already has produced an identical shingle. In a not too complex matrix this will require a gigabyte sized stack per thread.
> My solution is to avoid the recursive invocation by refactoring like this:
> {code:java}
> public Token next(final Token reusableToken) throws IOException {
> assert reusableToken != null;
> if (matrix == null) {
> matrix = new Matrix();
> // fill matrix with maximumShingleSize columns
> while (matrix.columns.size() < maximumShingleSize && readColumn()) {
> // this loop looks ugly
> }
> }
> // this loop exists in order to avoid recursive calls to the next method
> // as the complexity of a large matrix
> // then would require a multi gigabyte sized stack.
> Token token;
> do {
> token = produceNextToken(reusableToken);
> } while (token == request_next_token);
> return token;
> }
>
> private static final Token request_next_token = new Token();
> /**
> * This method exists in order to avoid reursive calls to the method
> * as the complexity of a fairlt small matrix then easily would require
> * a gigabyte sized stack per thread.
> *
> * @param reusableToken
> * @return null if exhausted, instance request_next_token if one more call is required for an answer, or instance parameter resuableToken.
> * @throws IOException
> */
> private Token produceNextToken(final Token reusableToken) throws IOException {
> {code}
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org