Posted to issues@opennlp.apache.org by "Jim Piliouras (Created) (JIRA)" <ji...@apache.org> on 2012/04/11 15:47:18 UTC

[jira] [Created] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

DictionaryNameFinder only outputs Spans of type "default"
---------------------------------------------------------

                 Key: OPENNLP-495
                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
             Project: OpenNLP
          Issue Type: Improvement
          Components: Name Finder
    Affects Versions: tools-1.5.3
         Environment: Ubuntu x64 Java 7 update 3
            Reporter: Jim Piliouras
             Fix For: tools-1.5.3


The DictionaryNameFinder always creates prediction Spans of type "default". Since we want to start merging results from several name finders, it makes sense to have them all output the same tag; otherwise it is impossible to merge the results properly. They will be merged, but some Spans will carry the default tag and others the user-specified tag. They can't be evaluated like that; they have to be consistent. This is very easy to fix. All I did was create a mutable String field and check whether it is null before creating the Span. If it is null, the usual happens (you get the default tag); if it isn't, the Span is created with whatever tag the user has supplied. In other words, a simple setter method can be used to set which tag the DictionaryNameFinder object uses. Of course this only works for single-type entities, but then again dictionaries tend to be single-type "repositories". I am confident that you can commit this soon: all I did was add a field, a setter method for that field, and an 'if' statement before creating the Span. What can possibly go wrong?
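
A minimal sketch of the setter-plus-null-check idea described above. The class SetterTypedFinder and the method typeForNewSpan are hypothetical names chosen for illustration; this is not the attached patch and not the OpenNLP source.

```java
// Hypothetical stand-in for the patched class; it is not the OpenNLP source.
class SetterTypedFinder {
    private static final String DEFAULT_TYPE = "default";

    // Stays null until the caller supplies a tag through setType.
    private String type;

    void setType(String type) {
        this.type = type;
    }

    // The 'if' described above: pick the tag just before a Span would be created.
    String typeForNewSpan() {
        if (type == null) {
            return DEFAULT_TYPE;
        }
        return type;
    }

    public static void main(String[] args) {
        SetterTypedFinder finder = new SetterTypedFinder();
        System.out.println(finder.typeForNewSpan()); // tag used before any setType call
        finder.setType("person");
        System.out.println(finder.typeForNewSpan()); // tag used after setType("person")
    }
}
```

Because the field is mutable, the tag can change between calls; that is the main trade-off of a setter compared with fixing the tag at construction time.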
 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: OPENNLP-495-4.patch

I ended up checking out the latest revision from svn, formatted the code using the Eclipse formatter XML file found on the website, and took the diff.

Should be OK to apply now.
                

       

[jira] [Resolved] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann resolved OPENNLP-495.
------------------------------------

    Resolution: Fixed
      Assignee: Joern Kottmann
    
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>            Assignee: Joern Kottmann
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495-5.patch, OPENNLP-495-6.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252336#comment-13252336 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

Sure, you would of course just create a second constructor.

I don't get it. What is bad about checking the type variable against null? It's common practice in Java. Using a boolean to avoid a null check can easily cause trouble when the two variables get out of sync. I doubt that it is faster, and even if it is a little bit faster, it's usually a bad idea to write "optimized" code which is harder to maintain.

Anyway, let's add a second constructor and then always assign a value to the type field.
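
To make the out-of-sync concern concrete, here is a hypothetical sketch. The second patch itself is not shown in this thread, so the class FlaggedFinder, the typeSet flag, and both method names are assumptions made for illustration only.

```java
// Hypothetical illustration only; the actual second patch is not shown in this thread.
class FlaggedFinder {
    private static final String DEFAULT_TYPE = "default";

    private String type;
    private boolean typeSet; // has to be kept consistent with 'type' by hand

    void setType(String type) {
        this.type = type;
        this.typeSet = true; // becomes true even when 'type' is null
    }

    // Flag-based variant: trusts the boolean and can hand back null.
    String typeByFlag() {
        return typeSet ? type : DEFAULT_TYPE;
    }

    // Null-check variant: the field is the single source of truth.
    String typeByNullCheck() {
        return type != null ? type : DEFAULT_TYPE;
    }

    public static void main(String[] args) {
        FlaggedFinder finder = new FlaggedFinder();
        finder.setType(null); // flag and value now disagree
        System.out.println(finder.typeByFlag());
        System.out.println(finder.typeByNullCheck());
    }
}
```

In this sketch, a single call to setType(null) is enough to make the flag claim a type exists when none does, while the null check keeps working.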

                

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252331#comment-13252331 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

Would you prefer the second constructor instead of the setter method? It is a two-minute change; if yes, please let me know.
                

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252462#comment-13252462 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

As a reminder: with this sorted, we are a tiny step away from being able to merge the results from different name finders properly. The remaining step is the ArrayStoreException I'm getting when copying the final merged ArrayList (parametrised with Span) into an array using list.toArray(new Span[list.size()]). If you have time to look at AggragateNameFinder.java at some point, that would be fantastic. It is frustrating because the exception occurs at the very last line of the method. I've debugged to check whether the merging actually happens, and it does. It is only when trying to return the Span[] that I get the error. In other words, once we sort out the exception it will work immediately, which is very exciting.
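
For context, a sketch of one way such an exception can arise. Whether this is the cause in AggragateNameFinder.java is not known from this thread; it is an assumption. In the sketch, whole arrays are added to the list through a raw type, so the list holds Span[] elements rather than Span elements, and the copy into a Span[] fails only at the final toArray call. ToArrayDemo and its nested Span are hypothetical stand-ins, not OpenNLP classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToArrayDemo {
    // Minimal stand-in for a span; not the OpenNLP Span class.
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Adds every element individually, so the list holds only Span objects.
    static Span[] mergeCorrect(Span[] a, Span[] b) {
        List<Span> merged = new ArrayList<>();
        merged.addAll(Arrays.asList(a));
        merged.addAll(Arrays.asList(b));
        return merged.toArray(new Span[merged.size()]);
    }

    // Adds whole arrays through a raw List; the compiler only warns about this.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Span[] mergeBroken(Span[] a, Span[] b) {
        List raw = new ArrayList<Span>();
        raw.add(a); // the element is a Span[], not a Span
        raw.add(b);
        List<Span> merged = raw;
        // Fails here: a Span[] element cannot be stored in a Span[] slot.
        return merged.toArray(new Span[merged.size()]);
    }

    public static void main(String[] args) {
        Span[] first = {new Span(0, 1, "person")};
        Span[] second = {new Span(2, 3, "person")};
        System.out.println(mergeCorrect(first, second).length);
        try {
            mergeBroken(first, second);
        } catch (ArrayStoreException e) {
            System.out.println("ArrayStoreException from toArray");
        }
    }
}
```

If the real code shows the same symptom, inspecting the runtime class of each list element just before the toArray call would confirm or rule out this explanation.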
                

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252415#comment-13252415 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

We have two constructors: the old one, and a new one where a user can pass in a type. If the old one is used, you just assign "default" to the new type field. If the new one is used, you should ensure that we don't get null (if so, throw an IllegalArgumentException) and then assign the type to the type field.

In the find method you now use the type field instead of the DEFAULT_TYPE constant.

Have a look at your patch file. There you should see only the following things:
- It adds the type field
- It adds your new constructor
- Modification to the existing constructor (should call the new one)
- Modification to the find method (one line changed)
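
The design outlined above can be sketched as follows. This is a simplified illustration under stated assumptions: TwoConstructorFinder, its nested Span, and the Set-based single-token lookup are hypothetical stand-ins, whereas the real class works with OpenNLP's own dictionary and Span types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in; not the OpenNLP DictionaryNameFinder.
class TwoConstructorFinder {
    private static final String DEFAULT_TYPE = "default";

    // Minimal span stand-in: token range [start, end) plus a type tag.
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    private final Set<String> entries;
    private final String type;

    // Old constructor: calls the new one, so 'type' is always assigned.
    TwoConstructorFinder(Set<String> entries) {
        this(entries, DEFAULT_TYPE);
    }

    // New constructor: rejects null rather than falling back silently.
    TwoConstructorFinder(Set<String> entries, String type) {
        if (type == null) {
            throw new IllegalArgumentException("type must not be null");
        }
        this.entries = new HashSet<>(entries);
        this.type = type;
    }

    // Single-token lookup only; uses the field where the constant was used before.
    List<Span> find(String[] tokens) {
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (entries.contains(tokens[i])) {
                spans.add(new Span(i, i + 1, type));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(Arrays.asList("Alice"));
        String[] tokens = {"Hello", "Alice"};
        for (Span s : new TwoConstructorFinder(names, "person").find(tokens)) {
            System.out.println(s.start + ".." + s.end + " " + s.type);
        }
    }
}
```

Since the type field is final and never null, the find method needs no null check and no extra flag.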
                

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: patch.diff

That is the diff file showing the differences between the old version and the one I patched. Sorry, I didn't realize I had to do this; I thought submitting the whole class was fine.
                

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment:     (was: DictionaryNameFinder.java)
    

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: OPENNLP-495-3.patch

OK, this is indeed the nicest and simplest version. I got rid of the setter method and instead created a second constructor that takes one extra argument (the type string) and initialises a field. Unless the second constructor is called, that field remains null for the rest of the object's life; otherwise it holds whatever the user has provided. I kept the getter method for the type, though, because I thought it may be useful to be able to ask the name finder what type tag it is producing. If you like this version, I can upload the whole final class.
                

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252367#comment-13252367 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

Right click on the file and then Team -> "Create patch..."
                

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: OPENNLP-495-5.patch

Patch number 5 is really tiny...I think this is what you need.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495-5.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The DictionaryNameFinder always creates prediction Spans of type: default. Since we want to start merging results from several name-finders it makes sense to have them all output the same tag, otherwise it is impossible to properly merge the results. I mean they will be merged but half will be with default tag and some with the other user-specified tag. They can't be evaluated like that...they have to be consistent. That is very easy to fix...All i did was to create a global, mutable, String field and I am checking whether it is null before creating the Span. If it is then the usual happens (you get the default tag), if it isn't however the Span is created with whatever tag the user has supplied. In other words, a simple setter method can be used to set what tag to use in the DictionaryNameFinder Object. Of course this only works for single-type entities but then again dictionaries tend to be single-type "repositories". I am confident that you can commit this soon...all i did was add a field, a setter method for that field and an 'if statement' before creating the Span. What can possibly go wrong?
>  
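A rough sketch of the merging problem described in the issue, using a stand-in span class rather than the real OpenNLP Span. The names MergeSketch, TaggedSpan and countByType are illustrative assumptions only. It shows that merged predictions fall into separate buckets when one finder emits "default" and another emits a user-specified tag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; TaggedSpan is a stand-in, not opennlp.tools.util.Span.
final class MergeSketch {

    // Token offsets [start, end) plus the type tag attached by a name finder.
    static final class TaggedSpan {
        final int start;
        final int end;
        final String type;

        TaggedSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Merges the output of two finders and counts the spans per type tag.
    static Map<String, Integer> countByType(List<TaggedSpan> first, List<TaggedSpan> second) {
        List<TaggedSpan> merged = new ArrayList<TaggedSpan>(first);
        merged.addAll(second);

        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (TaggedSpan span : merged) {
            Integer seen = counts.get(span.type);
            counts.put(span.type, seen == null ? 1 : seen + 1);
        }
        return counts;
    }
}
```

With mixed tags the same kind of entity ends up split across two keys, which is why a consistent tag is needed before the merged results can be evaluated.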

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252267#comment-13252267 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

In the first patch, if you pass the type in via the constructor, you can assign it to a field. If the user passes in null or just uses the old constructor, this field could be initialized to "default". Anyway, how much faster is the second version than the first one? I doubt that the boolean you added speeds anything up.
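A minimal sketch of the constructor-based approach suggested here, using a simplified stand-in class instead of the real DictionaryNameFinder (the dictionary argument is left out). The class name TypedFinderSketch, the constant and the accessor are illustrative assumptions, not the committed OpenNLP code.

```java
// Hypothetical stand-in for DictionaryNameFinder; not the actual OpenNLP class.
final class TypedFinderSketch {

    // Tag used when the caller supplies none; mirrors the existing "default" type.
    static final String DEFAULT_TYPE = "default";

    private final String type;

    // Old-style constructor: existing callers keep getting the default tag.
    TypedFinderSketch() {
        this(null);
    }

    // New constructor: a null type falls back to the default tag.
    TypedFinderSketch(String type) {
        this.type = (type == null) ? DEFAULT_TYPE : type;
    }

    // The null check ran once at construction, so a find() method could use this value directly.
    String getType() {
        return type;
    }
}
```

Because the fallback is resolved in the constructor, no per-prediction null check and no extra boolean would be needed under this design.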
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252407#comment-13252407 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

OK, you're right, there are differences between the .find() method I was working on and the new .find() method. I fixed that: I'm now using the .find() code from the new revision, so the bug has been addressed. However, you mentioned that the overloaded .find() method is unnecessary. We were discussing this earlier this morning: without this method we check for null on every single prediction. This extra computation will accumulate for large texts. What do you want me to do? The new patch I've got (not yet uploaded) moves all the new code to the bottom (keeping the original ordering of methods), removes my comments and includes the bug fix. However, I still have the overloaded .find() method (at the end).

Submit it or revert to the "one .find() method" approach?
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251577#comment-13251577 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

Please provide a patch file, so we can easily review your changes and apply them.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252379#comment-13252379 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

I only have "Apply Patch" in my Eclipse Indigo. Anyway, I used the Eclipse code formatter and saved the result to a file that I can run the diff against...
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252364#comment-13252364 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

Hmm, the version I've been working on all this time is the one I checked out 10 days or so ago. Is there a chance the patch is no longer valid?
I never use tabs for indentation and I generally adhere to the Java coding conventions. However, when I copy the source from Eclipse to gedit to do the diff, some styling is lost. I'll try to do that from within Eclipse. It might take a while because I've never done it before...
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: DictionaryNameFinder.java

Very lightly patched this class to allow the user to set what 'type' of spans this name-finder should produce.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: OPENNLP-495.patch

oops! corrected the name...
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: OPENNLP-495-2.patch

In my previous patch the whole idea was not to change or rearrange the code too much. However, this led to a major inefficiency, because a null check has to be performed for every prediction (due to the nested for loops inside the .find() method). It took quite a bit of rearranging to address this. The most prominent change was moving all the code from the find() method to an overloaded version of find that takes an extra argument: the type string. In order for this to work I needed an extra boolean variable that keeps track of whether the current tag is the default or not. Nothing special really, just a bunch of rearranging, and now we only check once for the tag. Once for every sentence, that is, NOT for the whole document (the find method typically deals with sentences).

This 2nd patch is the improved and efficient version, while the 1st is the one with the minimum code change/rearranging.
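The rearrangement described above could look roughly like the following sketch. It is not the attached patch: HoistedCheckSketch, TypedSpan and the longest-match loop are simplified, hypothetical stand-ins, and this version resolves the tag into a local variable instead of using the extra boolean. The point it illustrates is that the null check runs once per find() call, outside the nested loops.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified finder used only to show where the null check sits.
final class HoistedCheckSketch {

    // Minimal stand-in for a typed span: [start, end) token offsets plus a type tag.
    static final class TypedSpan {
        final int start;
        final int end;
        final String type;

        TypedSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    private final Set<String> entries;

    HoistedCheckSketch(Set<String> entries) {
        this.entries = entries;
    }

    // Convenience overload that keeps the old behaviour (default tag).
    List<TypedSpan> find(String[] tokens) {
        return find(tokens, null);
    }

    List<TypedSpan> find(String[] tokens, String type) {
        // Resolve the tag once per call, before entering the nested loops.
        String resolved = (type == null) ? "default" : type;

        List<TypedSpan> found = new ArrayList<TypedSpan>();
        for (int start = 0; start < tokens.length; start++) {
            // Prefer the longest dictionary entry that begins at this token.
            for (int end = tokens.length; end > start; end--) {
                if (entries.contains(join(tokens, start, end))) {
                    found.add(new TypedSpan(start, end, resolved));
                    start = end - 1; // continue after the matched tokens
                    break;
                }
            }
        }
        return found;
    }

    // Joins tokens[start..end) with single spaces to form a lookup key.
    private static String join(String[] tokens, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < end; i++) {
            if (i > start) {
                sb.append(' ');
            }
            sb.append(tokens[i]);
        }
        return sb.toString();
    }
}
```

Calling find(tokens) yields spans tagged "default", while find(tokens, "person") tags every span "person" without any per-match check.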
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251615#comment-13251615 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

You attached the entire source file and not a patch.

Have a look here:
http://en.wikipedia.org/wiki/Patch_(Unix)
http://www.castor.org/how-to-prepare-a-patch.html

Your IDE likely has support for creating a patch. We usually name patches after the JIRA issue, e.g. OPENNLP-495.patch, and then add a counter when multiple patches are submitted or something was changed, e.g. OPENNLP-495-2.patch, OPENNLP-495-3.patch.


                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252471#comment-13252471 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

Oops, sorry, I meant to ask if you could have a look at AggragateNameFinder2.java. That is the latest one that does the job.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>            Assignee: Joern Kottmann
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495-5.patch, OPENNLP-495-6.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252383#comment-13252383 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

Maybe you didn't install the Subversion plugin? I suggest getting Eclipse set up correctly so you can use it to produce the patch.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252330#comment-13252330 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

Yes, you're right, you can pass the type as a constructor argument, but that would break backwards compatibility (unless I create a 2nd constructor). Every time I write code for OpenNLP I try to make as few changes to the original code as possible. The second patch is not really much faster, simply because the find method is called on sentences, NOT on documents, so how big can a sentence realistically be? It is, however, more sensible. The speed difference will be obvious only in the unlikely case that you get a massive sentence to predict for. The first patch does a null check for every prediction, which is wasteful, isn't it? Ideally, as far as the dictionary is concerned, you want to check only once for the whole document, or at least once for every sentence like I did.

Also, I don't know if you noticed, but the boolean is not really "necessary"! I could achieve the same without the boolean; it's just that I'd have to play around with null quite a bit, which is definitely not recommended in Java. In other words, the boolean is there only for the sake of readability/sensibility. Null can be very nasty in Java because it carries no information about truthiness as it does in Clojure.
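The difference between checking for null on every prediction and checking once per sentence can be sketched roughly as follows. This is only an illustration under stated assumptions: `TypedSpan` and `SetterBasedFinder` are hypothetical stand-ins, not the OpenNLP `Span` or `DictionaryNameFinder` classes, and the matching logic is reduced to a single-token lookup.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a typed span; not the OpenNLP Span class.
final class TypedSpan {
  final int start;
  final int end;
  final String type;

  TypedSpan(int start, int end, String type) {
    this.start = start;
    this.end = end;
    this.type = type;
  }
}

// Hypothetical finder that tags every occurrence of one known token.
class SetterBasedFinder {
  private static final String DEFAULT_TYPE = "default";
  private final String entry;
  private String type; // stays null if the setter is never called

  SetterBasedFinder(String entry) {
    this.entry = entry;
  }

  void setType(String type) {
    this.type = type;
  }

  // Variant 1: the null check runs once for every match.
  List<TypedSpan> findCheckingPerMatch(String[] tokens) {
    List<TypedSpan> spans = new ArrayList<>();
    for (int i = 0; i < tokens.length; i++) {
      if (entry.equals(tokens[i])) {
        String tag = (type == null) ? DEFAULT_TYPE : type;
        spans.add(new TypedSpan(i, i + 1, tag));
      }
    }
    return spans;
  }

  // Variant 2: the null check runs once per call, i.e. once per sentence.
  List<TypedSpan> findCheckingOnce(String[] tokens) {
    String tag = (type == null) ? DEFAULT_TYPE : type;
    List<TypedSpan> spans = new ArrayList<>();
    for (int i = 0; i < tokens.length; i++) {
      if (entry.equals(tokens[i])) {
        spans.add(new TypedSpan(i, i + 1, tag));
      }
    }
    return spans;
  }

  public static void main(String[] args) {
    SetterBasedFinder finder = new SetterBasedFinder("Paris");
    finder.setType("location");
    String[] sentence = {"Paris", "is", "big"};
    for (TypedSpan span : finder.findCheckingOnce(sentence)) {
      System.out.println(span.start + ".." + span.end + " " + span.type);
    }
  }
}
```

Both variants return the same spans; the second only moves the check out of the loop, which matters little for sentence-sized input.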
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252361#comment-13252361 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

Your patch does not match the current head version. I suggest that you use svn diff to create the patch.

Please also have a look at our code conventions:
http://opennlp.apache.org/code-conventions.html

We use two spaces to indent and no tabs.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment:     (was: patch.diff)
    
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Closed] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann closed OPENNLP-495.
----------------------------------

    
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>            Assignee: Joern Kottmann
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495-5.patch, OPENNLP-495-6.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251585#comment-13251585 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------


I did, didn't I? It should be attached to the JIRA issue. It is a slight
modification of the original DictionaryNameFinder.java that adds an
extra private field, a setter method for that field, and a conditional
statement to decide which tag to use before creating the Span object.


Jim

                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Updated] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Piliouras updated OPENNLP-495:
----------------------------------

    Attachment: OPENNLP-495-6.patch

OK, I did exactly as you said. The way it is set up now, alt_type follows the constructor calls and is set accordingly. Once it has been set by a constructor, there is no way for the user to change it, and it can never be null.
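One way to get both guarantees mentioned here (no later change, never null) in plain Java is a final field assigned in the constructor. The class below is a hypothetical stand-in, not the patched DictionaryNameFinder; the name `FixedTypeFinder` and the use of `Objects.requireNonNull` to reject null are assumptions made purely for illustration.

```java
import java.util.Objects;

// Hypothetical class for illustration; not the actual OpenNLP source.
class FixedTypeFinder {
  // final: assigned exactly once, in the constructor, and there is no setter.
  private final String type;

  FixedTypeFinder(String type) {
    // Fails fast with a NullPointerException instead of storing null.
    this.type = Objects.requireNonNull(type, "type must not be null");
  }

  String getType() {
    return type;
  }

  public static void main(String[] args) {
    FixedTypeFinder finder = new FixedTypeFinder("person");
    System.out.println(finder.getType());
  }
}
```

With this shape, code that creates spans can read the field directly without any null check.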
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495-5.patch, OPENNLP-495-6.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252387#comment-13252387 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

There are still a couple of issues here. The patch can be applied now, but it should change only about 10 lines of code (adding a new constructor and referencing an instance variable instead of a constant in one place).

Instead, it now reformats everything, reorders methods, adds an unneeded method and, most importantly, removes a bug fix recently made to this class (OPENNLP-471).

Please provide a new patch which just changes what is necessary.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Joern Kottmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252453#comment-13252453 ] 

Joern Kottmann commented on OPENNLP-495:
----------------------------------------

Applied the patch now with some modifications. You should not create a new instance of the DictionaryNameFinder from its constructor. By calling a constructor I meant using "this(...)".
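For readers unfamiliar with the idiom, constructor chaining with "this(...)" looks roughly like the sketch below. The class `ChainedFinder` and its fields are hypothetical and only illustrate the Java mechanism; they do not reproduce the code that was actually committed.

```java
// Hypothetical class for illustration; not the actual DictionaryNameFinder source.
class ChainedFinder {
  private static final String DEFAULT_TYPE = "default";
  private final String dictionaryName;
  private final String type;

  // Full constructor: the only place where the fields are assigned.
  ChainedFinder(String dictionaryName, String type) {
    this.dictionaryName = dictionaryName;
    this.type = type;
  }

  // Convenience constructor: this(...) delegates to the constructor above
  // and initializes this same object. Writing
  // "new ChainedFinder(dictionaryName, DEFAULT_TYPE)" here instead would
  // build a second, separate object that is immediately discarded,
  // without initializing the object being constructed.
  ChainedFinder(String dictionaryName) {
    this(dictionaryName, DEFAULT_TYPE);
  }

  String getDictionaryName() {
    return dictionaryName;
  }

  String getType() {
    return type;
  }

  public static void main(String[] args) {
    System.out.println(new ChainedFinder("cities").getType());
    System.out.println(new ChainedFinder("cities", "location").getType());
  }
}
```

Existing callers of the one-argument constructor keep working unchanged, which is the backwards-compatibility concern raised earlier in the thread.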
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495-5.patch, OPENNLP-495-6.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252456#comment-13252456 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

Ah, of course! I had completely forgotten about using "this()". Thanks for reminding me.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>            Assignee: Joern Kottmann
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495-3.patch, OPENNLP-495-4.patch, OPENNLP-495-5.patch, OPENNLP-495-6.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

       

[jira] [Commented] (OPENNLP-495) DictionaryNameFinder only outputs Spans of type "default"

Posted by "Jim Piliouras (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OPENNLP-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252338#comment-13252338 ] 

Jim Piliouras commented on OPENNLP-495:
---------------------------------------

It is not 'bad' per se, but rather "risky", simply because null is so closely tied to the most famous exception (NullPointerException). I tend not to use null checks in Java conditionals but rather "try-catch" statements, whereas in Clojure "nil" evaluates to false, which gives you the confidence to use it in "if" and "if-let" statements without worrying about any consequences. Anyway, as I said, it will only be noticeably faster when unrealistically long sentences are passed to the .find() method, but that is still a possibility. I'll sort out the 2nd constructor now. Thanks for the feedback.
                
> DictionaryNameFinder only outputs Spans of type "default"
> ---------------------------------------------------------
>
>                 Key: OPENNLP-495
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-495
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: tools-1.5.3
>         Environment: Ubuntu x64 Java 7 update 3
>            Reporter: Jim Piliouras
>              Labels: patch
>             Fix For: tools-1.5.3
>
>         Attachments: DictionaryNameFinder.java, OPENNLP-495-2.patch, OPENNLP-495.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h