You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Karl Wettin (JIRA)" <ji...@apache.org> on 2009/01/09 16:38:59 UTC

[jira] Closed: (LUCENE-1514) ShingleMatrixFilter eaily throws StackOverFlow as the complexity of a matrix grows

     [ https://issues.apache.org/jira/browse/LUCENE-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wettin closed LUCENE-1514.
-------------------------------

       Resolution: Fixed
    Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

Committed in revision 733064

> ShingleMatrixFilter eaily throws StackOverFlow as the complexity of a matrix grows
> ----------------------------------------------------------------------------------
>
>                 Key: LUCENE-1514
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1514
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 2.4
>            Reporter: Karl Wettin
>            Assignee: Karl Wettin
>             Fix For: 2.9
>
>         Attachments: LUCENE-1514.txt
>
>
> ShingleMatrixFilter#next makes a recursive function invocation when the current permutation iterator is exhausted or if the current state of the permutation iterator already has produced an identical shingle. In a not too complex matrix this will require a gigabyte sized stack per thread.
> My solution is to avoid the recursive invocation by refactoring like this:
> {code:java}
> public Token next(final Token reusableToken) throws IOException {
>     assert reusableToken != null;
>     if (matrix == null) {
>       matrix = new Matrix();
>       // fill matrix with maximumShingleSize columns
>       while (matrix.columns.size() < maximumShingleSize && readColumn()) {
>         // this loop looks ugly
>       }
>     }
>     // this loop exists in order to avoid recursive calls to the next method
>     // as the complexity of a large matrix
>     // then would require a multi gigabyte sized stack.
>     Token token;
>     do {
>       token = produceNextToken(reusableToken);
>     } while (token == request_next_token);
>     return token;
>   }
>   
>   private static final Token request_next_token = new Token();
>   /**
>    * This method exists in order to avoid reursive calls to the method
>    * as the complexity of a fairlt small matrix then easily would require
>    * a gigabyte sized stack per thread.
>    *
>    * @param reusableToken
>    * @return null if exhausted, instance request_next_token if one more call is required for an answer, or instance parameter resuableToken.
>    * @throws IOException
>    */
>   private Token produceNextToken(final Token reusableToken) throws IOException {
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org