You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2009/12/16 14:53:18 UTC

[jira] Created: (SOLR-1662) BufferedTokenStream incorrect cloning

BufferedTokenStream incorrect cloning
-------------------------------------

                 Key: SOLR-1662
                 URL: https://issues.apache.org/jira/browse/SOLR-1662
             Project: Solr
          Issue Type: Bug
          Components: Schema and Analysis
    Affects Versions: 1.4
            Reporter: Robert Muir


As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
{code}
 public static String tsToString(TokenStream in) throws IOException {
    StringBuilder out = new StringBuilder();
    TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
    // extra safety to enforce, that the state is not preserved and also
    // assign bogus values
    in.clearAttributes();
    termAtt.setTermBuffer("bogusTerm");
    while (in.incrementToken()) {
      if (out.length() > 0)
        out.append(' ');
      out.append(termAtt.term());
      in.clearAttributes();
      termAtt.setTermBuffer("bogusTerm");
    }

    in.close();
    return out.toString();
  }
{code}

Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.

This can be fixed in two ways: 
* BufferedTokenStream does the cloning
* subclasses are responsible for the cloning

The question is which one should it be?


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791866#action_12791866 ] 

Shalin Shekhar Mangar commented on SOLR-1662:
---------------------------------------------

{quote}
So if we decide its the responsibility of the subclass, these implementations need thorough tests to see if they are ok or not.
If we add the cloning to BufferedTokenStream itself, then we know they are ok...
{quote}

I think cloning should be done by sub-classes before writing. If BufferedTokenStream clones the token then every sub-class pays the price even though the use-case may just be to throw the token away.

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar resolved SOLR-1662.
-----------------------------------------

    Resolution: Fixed

Committed revision 891889.

Thanks Robert and Uwe!

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-1662.patch
>
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated SOLR-1662:
------------------------------

    Attachment: SOLR-1662.patch

attached is patch to fix the javadocs and test (this same test is also an example in the javadocs header...)

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-1662.patch
>
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791940#action_12791940 ] 

Uwe Schindler commented on SOLR-1662:
-------------------------------------

+1 Looks good!

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-1662.patch
>
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar reassigned SOLR-1662:
-------------------------------------------

    Assignee: Shalin Shekhar Mangar

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>            Assignee: Shalin Shekhar Mangar
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791888#action_12791888 ] 

Robert Muir commented on SOLR-1662:
-----------------------------------

cool, will work up a patch with javadocs wordage (I think we need this at least, its not obvious and there is no mention of it), and any fixes.

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791370#action_12791370 ] 

Uwe Schindler commented on SOLR-1662:
-------------------------------------

Just the short desription from the API side in Lucene:
Lucene's documentation of TokenStream.next() says: "The returned Token is a "full private copy" (not re-used across calls to next())". AB_AAB_Stream.process() duplicates the token by just putting it uncloned into the outQueue. As the consumer of the BufferedTokenStream assumes that the Token is private it is allowed to change it - and by that it also changes the token in the outQueue. If you e.g. put another TokenFilter in fromt of this AB_AAB_Stream, and modify the token there it would break.
In my opinion, the responsibility to clone is in AB_AAB_Stream, BufferedTokenStream will never return the same token twice by itsself. So its a bug in the test. But Robert told me that e.g. RemoveDuplicates has a similar problem.
The general contract for writing such streams is: whenever you return a Token from next(), never put it somewhere else uncloned, because the caller can change it.

The fix is to do: write((Token) t.clone());

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791867#action_12791867 ] 

Uwe Schindler commented on SOLR-1662:
-------------------------------------

bq. I think cloning should be done by sub-classes before writing. If BufferedTokenStream clones the token then every sub-class pays the price even though the use-case may just be to throw the token away.

+1, that was what i said in my first comment, too. Because BufferedTokenStream itsself never reuses the token. The problem is the test and RemoveDuplicates, that push the same instance twice into the queue.

> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791910#action_12791910 ] 

Robert Muir edited comment on SOLR-1662 at 12/17/09 1:09 PM:
-------------------------------------------------------------

attached is patch to fix the javadocs and test (this same test is also an example in the javadocs header...)

edit: also btw i finished converting the tests to assertTokenStreamContents for the other BufferedTokenStreams, there are no problems in DuplicateFilter or CommonGrams, so this should be all we need.


      was (Author: rcmuir):
    attached is patch to fix the javadocs and test (this same test is also an example in the javadocs header...)
  
> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-1662.patch
>
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1662) BufferedTokenStream incorrect cloning

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791374#action_12791374 ] 

Robert Muir commented on SOLR-1662:
-----------------------------------

bq. but Robert told me that e.g. RemoveDuplicates has a similar problem.

Right, there is no cloning in RemoveDuplicates. CommonGrams creates a new Token() when it grams, but its not clear that this one is correct either.

So if we decide its the responsibility of the subclass, these implementations need thorough tests to see if they are ok or not.
If we add the cloning to BufferedTokenStream itself, then we know they are ok... 


> BufferedTokenStream incorrect cloning
> -------------------------------------
>
>                 Key: SOLR-1662
>                 URL: https://issues.apache.org/jira/browse/SOLR-1662
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes (BaseTokenTestCase) to use the new TokenStream API, but also with some additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do not clear or clone properly. In this case there is a problem with a tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.