Posted to issues@opennlp.apache.org by "Aliaksandr Autayeu (Created) (JIRA)" <ji...@apache.org> on 2011/11/10 17:14:52 UTC

[jira] [Created] (OPENNLP-368) loops improved in opennlp-tools

loops improved in opennlp-tools
-------------------------------

                 Key: OPENNLP-368
                 URL: https://issues.apache.org/jira/browse/OPENNLP-368
             Project: OpenNLP
          Issue Type: Improvement
    Affects Versions: tools-1.5.3-incubating
            Reporter: Aliaksandr Autayeu
            Priority: Minor
         Attachments: 0008-loops-improved-in-tools.patch

Many old-style indexed loops are replaced with Java 5 for-each loops to improve code readability and reduce the possibility of bugs.
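
For illustration, the kind of rewrite described here might look like the sketch below. It is not taken from the attached patch; the class and method names are hypothetical, and the token list is borrowed from an example label quoted elsewhere in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not code from the OpenNLP patch.
final class LoopStyles {

    // Old style: the explicit index leaves room for off-by-one
    // errors or for indexing the wrong collection.
    static int totalLengthIndexed(List<String> tokens) {
        int total = 0;
        for (int i = 0; i < tokens.size(); i++) {
            total += tokens.get(i).length();
        }
        return total;
    }

    // Java 5 style: no index variable, so there is nothing to get wrong.
    static int totalLengthForEach(List<String> tokens) {
        int total = 0;
        for (String token : tokens) {
            total += token.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Cleaning", "Equipment", "and", "Supplies");
        // Both variants compute the same sum of token lengths.
        System.out.println(totalLengthIndexed(tokens));
        System.out.println(totalLengthForEach(tokens));
    }
}
```

The two forms are interchangeable only when the loop body does not use the index for anything else, such as writing to a parallel array or looking at neighbouring elements.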

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Aliaksandr Autayeu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148415#comment-13148415 ] 

Aliaksandr Autayeu commented on OPENNLP-368:
--------------------------------------------

Exactly, to avoid getting something unexpected.

> I am a little worried that the patch might introduce bugs, which could be [...]

OK. Can you elaborate on "extensive testing"? In short, how can I prove
it's OK to apply this simple refactoring?

> In other parts we have good junit test coverage, there it would be safe to [...]

It would be good if it were applied at least partially. The more accurate the
code is, the better.

Aliaksandr

                
> loops improved in opennlp-tools
> -------------------------------
>
>                 Key: OPENNLP-368
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-368
>             Project: OpenNLP
>          Issue Type: Improvement
>    Affects Versions: tools-1.5.3-incubating
>            Reporter: Aliaksandr Autayeu
>            Priority: Minor
>              Labels: patch
>         Attachments: 0008-loops-improved-in-tools.patch
>
>
> Many old-style indexed loops replaced with Java5 for each loops to improve code readability and reduce possibility of bugs.


[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148406#comment-13148406 ] 

Joern Kottmann commented on OPENNLP-368:
----------------------------------------

There is no need to use generics either: we just need an Object on which to call equals, and client code does not need to cast.
I don't see how generics could improve the code. The only thing you get is that the compiler can check that a client always calls it with the same type, so that nothing is passed in accidentally.
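
As a rough sketch of the compile-time check mentioned above: the class below is hypothetical and is not the OpenNLP FMeasure API. It assumes the type parameter is fixed on the class, since a type parameter declared only on a method would simply be inferred as a common supertype of both arguments and so would check nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class for illustration; not part of OpenNLP.
final class TypedMatchCounter<T> {

    private int matches;

    // Both arrays must have the element type chosen when the counter was created.
    void update(T[] references, T[] predictions) {
        List<T> remaining = new ArrayList<T>(Arrays.asList(references));
        for (T prediction : predictions) {
            // remove(Object) compares with equals() and drops one matching reference.
            if (remaining.remove(prediction)) {
                matches++;
            }
        }
    }

    int matches() {
        return matches;
    }

    public static void main(String[] args) {
        TypedMatchCounter<String> counter = new TypedMatchCounter<String>();
        counter.update(new String[] {"a", "b"}, new String[] {"b", "c"});
        System.out.println(counter.matches());

        // The next line would be rejected by the compiler, because
        // Integer[] is not a String[]:
        // counter.update(new String[] {"a"}, new Integer[] {1});
    }
}
```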

I am a little worried that the patch might introduce bugs, which could be quite hard to find later on. That is why we should do extensive testing before applying it. For some parts this is currently not possible, e.g. coref.

In other parts we have good JUnit test coverage; there it would be safe to apply your changes.
                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148437#comment-13148437 ] 

Joern Kottmann commented on OPENNLP-368:
----------------------------------------

I will try to apply your patch partly and see what we can do. Maybe you can then update your patch afterward and maybe split it. At least the coref changes should be applied after we have proper training and testing in place there.

                

[jira] [Updated] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Aliaksandr Autayeu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aliaksandr Autayeu updated OPENNLP-368:
---------------------------------------

    Attachment: 0008-loops-improved-in-tools.patch
    

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Aliaksandr Autayeu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148441#comment-13148441 ] 

Aliaksandr Autayeu commented on OPENNLP-368:
--------------------------------------------

OK. That'll be good. I agree that it is safer to change unit-test covered
code.

Aliaksandr

On Fri, Nov 11, 2011 at 12:52 PM, Joern Kottmann (Commented) (JIRA) <


                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147802#comment-13147802 ] 

Joern Kottmann commented on OPENNLP-368:
----------------------------------------

I will review your patch. BTW, do you know why some patches get so big? Does that have to do with the end-of-line characters?
                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Aliaksandr Autayeu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148446#comment-13148446 ] 

Aliaksandr Autayeu commented on OPENNLP-368:
--------------------------------------------

I have a number of private datasets (17 and growing), ranging from 2K to
130K short labels (2-3 tokens on avg) like these:

Cleaning_NN Equipment_NN and_CC Supplies_NNS
Commercial_JJ and_CC Military_JJ and_CC Private_JJ Vehicles_NNS and_CC their_PP$ Accessories_NNS and_CC Components_NNS
Defense_NN and_CC Law_NN Enforcement_NN and_CC Security_NN and_CC Safety_NN Equipment_NN and_CC Supplies_NNS

or these:

groups_NNS of_IN animals_NNS
lower_JJR animals_NNS
mammals_NNS
landscapes_NNS with_IN waters_NNS ,_, waterscapes_NNS ,_, seascapes_NNS (_( in_IN the_DT temperate_JJ zone_NN )_)

on which I regularly train and test the tokenizer and POS tagger (I also have
some NE annotations, but currently do not work on them). Perhaps I can test
on these, if given proper instructions.
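
The labels above pair each token with its tag using an underscore. A minimal sketch of reading one such line into parallel token and tag arrays could look like this; the class name and the error handling are assumptions for illustration, not an OpenNLP API.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of OpenNLP.
final class TaggedLabelParser {

    // Returns {tokens, tags} for one whitespace-separated line of word_TAG pairs.
    static String[][] parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new String[][] { new String[0], new String[0] };
        }
        String[] pairs = trimmed.split("\\s+");
        String[] tokens = new String[pairs.length];
        String[] tags = new String[pairs.length];
        // The index is needed here because two parallel arrays are filled.
        for (int i = 0; i < pairs.length; i++) {
            // Split on the last underscore so that the token itself may contain one.
            int separator = pairs[i].lastIndexOf('_');
            if (separator <= 0 || separator == pairs[i].length() - 1) {
                throw new IllegalArgumentException("Not a word_TAG pair: " + pairs[i]);
            }
            tokens[i] = pairs[i].substring(0, separator);
            tags[i] = pairs[i].substring(separator + 1);
        }
        return new String[][] { tokens, tags };
    }

    public static void main(String[] args) {
        String[][] parsed = parse("landscapes_NNS with_IN waters_NNS ,_, waterscapes_NNS");
        System.out.println(Arrays.toString(parsed[0]));
        System.out.println(Arrays.toString(parsed[1]));
    }
}
```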

Aliaksandr


On Fri, Nov 11, 2011 at 12:02 PM, Joern Kottmann (Commented) (JIRA) <


                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148358#comment-13148358 ] 

Joern Kottmann commented on OPENNLP-368:
----------------------------------------

Reviewed parts of the patch. Looks good, but we need good test coverage of the code we change.
Did you do any testing?

Do you have this in a git branch? So we could delay and change it easily?
                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Aliaksandr Autayeu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148397#comment-13148397 ] 

Aliaksandr Autayeu commented on OPENNLP-368:
--------------------------------------------

I usually run JUnit tests. Is there any other testing I should do?

> Do you have this in a git branch? So we could delay and change it easily?

Yes. But why delay? :)

Aliaksandr

                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Aliaksandr Autayeu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148395#comment-13148395 ] 

Aliaksandr Autayeu commented on OPENNLP-368:
--------------------------------------------

Can't this be generified then? I checked the code and didn't see
dependencies requiring Objects; that's why I made the change. When the need
arises, it can be changed to generics. What do you think?

Aliaksandr

                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148362#comment-13148362 ] 

Joern Kottmann commented on OPENNLP-368:
----------------------------------------

In the FMeasure class this introduces a non-backward-compatible change by replacing params of the type Object with Span.
It should stay Object, because the counting logic relies on equals and can therefore work with many types.
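
To make the point concrete, here is a minimal sketch of counting that relies only on equals. It is not the actual FMeasure implementation; the class and method names are hypothetical, and matching each reference at most once is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the OpenNLP FMeasure code.
final class EqualsBasedCounting {

    // Counts predictions that equal some reference; each reference matches at most once.
    static int countTruePositives(Object[] references, Object[] predictions) {
        List<Object> remaining = new ArrayList<Object>(Arrays.asList(references));
        int truePositives = 0;
        for (Object prediction : predictions) {
            // remove(Object) uses equals(), so any type with a sensible equals() works.
            if (remaining.remove(prediction)) {
                truePositives++;
            }
        }
        return truePositives;
    }

    public static void main(String[] args) {
        // The same method handles unrelated element types without casts.
        System.out.println(countTruePositives(
                new String[] {"a", "b", "c"}, new String[] {"b", "c", "d"}));
        System.out.println(countTruePositives(
                new Integer[] {1, 2}, new Integer[] {2, 3}));
    }
}
```

Because the parameters are Object[], existing callers that pass String[], Span[], or any other array type keep compiling, which is the backward-compatibility concern raised above.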
                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148456#comment-13148456 ] 

Joern Kottmann commented on OPENNLP-368:
----------------------------------------

There is some public data available for which we have written parsers to convert it to the OpenNLP format. Have a look at the documentation to get an overview.

I think we should move the parser and coref changes to two separate issues. The parser changes should be applied before the next release, but since the testing is kind of time intensive I might want to do this just before we release.
                

[jira] [Commented] (OPENNLP-368) loops improved in opennlp-tools

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148422#comment-13148422 ] 

Joern Kottmann commented on OPENNLP-368:
----------------------------------------

Usually we train on a couple of public and private data sets to ensure it still works.
Have a look at our test plan for the 1.5.2 release:
https://cwiki.apache.org/OPENNLP/testplan152.html
                