Posted to dev@opennlp.apache.org by Mark G <gi...@gmail.com> on 2014/05/07 02:18:28 UTC

TokenNameFinder and Span probs

I am currently working on a project in which we use NER to pass
toponyms into the GeoEntityLinker addon for geotagging, and I pass
the locations, entities, and other info into Solr for indexing. Over the
years I have noticed that the TokenNameFinder interface does not include
the probs() methods that NameFinderME has, and furthermore the Span
object has no double field for storing its own prob. Also,
the sentence detector has a method called getSentenceProbabilities() rather
than probs().

When I pass the Spans into the GeoEntityLinker/EntityLinker I can no longer
get the probs, because they are not stored in the Span objects. I could extend
Span and add the field, or keep a 2D array of the probs for each sentence,
but I wanted to see what everyone thinks about:
1. Adding the probs() methods to the TokenNameFinder interface
2. Adding a prob field (a double) to Span
3. Having the NameFinder return the prob with each Span, so it doesn't have
to be set after the call to find() using the double[] of probs
4. Having SentenceDetectorME return its spans with a prob, adding a probs()
method to the SentenceDetector interface, and deprecating
getSentenceProbabilities()
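
To make points 2 and 3 concrete, here is a rough sketch of the manual step I
would like to avoid. It assumes nothing about the real classes: PlainSpan,
ScoredSpan, and attachProbs are made-up names for illustration only, not
OpenNLP API, and the prob values are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are OpenNLP classes.
final class SpanProbSketch {

    /** Stand-in for a span as it is today: token offsets and a type, no prob. */
    static final class PlainSpan {
        final int start;   // first token index, inclusive
        final int end;     // last token index, exclusive
        final String type; // e.g. "location"

        PlainSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Stand-in for point 2: the same span plus a double prob field. */
    static final class ScoredSpan {
        final int start;
        final int end;
        final String type;
        final double prob;

        ScoredSpan(PlainSpan span, double prob) {
            this.start = span.start;
            this.end = span.end;
            this.type = span.type;
            this.prob = prob;
        }
    }

    /**
     * The manual step that point 3 would make unnecessary: pair each span
     * with the matching entry of a parallel double[] of probs.
     */
    static List<ScoredSpan> attachProbs(List<PlainSpan> spans, double[] probs) {
        if (spans.size() != probs.length) {
            throw new IllegalArgumentException("expected one prob per span");
        }
        List<ScoredSpan> scored = new ArrayList<>(spans.size());
        for (int i = 0; i < probs.length; i++) {
            scored.add(new ScoredSpan(spans.get(i), probs[i]));
        }
        return scored;
    }

    public static void main(String[] args) {
        List<PlainSpan> spans = Arrays.asList(
                new PlainSpan(0, 1, "location"),
                new PlainSpan(4, 6, "location"));
        double[] probs = {0.93, 0.58}; // made-up values for illustration
        for (ScoredSpan s : attachProbs(spans, probs)) {
            System.out.println(s.type + " [" + s.start + ".." + s.end + ") prob=" + s.prob);
        }
    }
}
```

If the finder filled in the prob itself, callers would get the ScoredSpan-like
objects straight from find() and the attachProbs step would disappear.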

Thoughts?

Re: TokenNameFinder and Span probs

Posted by Mark G <gi...@gmail.com>.
I'll be working on this over the next few days. I'll file separate tickets to
cover the changes to NameFinder and SentenceDetector.


On Wed, May 7, 2014 at 3:22 AM, Joern Kottmann <ko...@gmail.com> wrote:

> Hello Mark,
>
> +1 for your second solution. I believe that is much more intuitive than
> calling a method afterwards to retrieve the prob for a Span.
> It is easier to use because the prob is delivered as part of the result and
> no user action is required to obtain it.
>
> We could use this solution everywhere a Span gets returned.
>
> Jörn

Re: TokenNameFinder and Span probs

Posted by William Colen <wi...@gmail.com>.
+1 for the second too

On Wednesday, May 7, 2014, Joern Kottmann <ko...@gmail.com> wrote:

> Hello Mark,
>
> +1 for your second solution. I believe that is much more intuitive than
> calling a method afterwards to retrieve the prob for a Span.
> It is easier to use because the prob is delivered as part of the result and
> no user action is required to obtain it.
>
> We could use this solution everywhere a Span gets returned.
>
> Jörn


-- 
William Colen

Re: TokenNameFinder and Span probs

Posted by Rodrigo Agerri <ag...@gmail.com>.
+1 to the second solution too, and to using it everywhere a Span
object is returned.

Rodrigo

On 2014/05/07 at 09:22, Joern Kottmann wrote:
> Hello Mark,
> 
> +1 for your second solution. I believe that is much more intuitive than
> calling a method afterwards to retrieve the prob for a Span.
> It is easier to use because the prob is delivered as part of the result and
> no user action is required to obtain it.
> 
> We could use this solution everywhere a Span gets returned.
> 
> Jörn

Re: TokenNameFinder and Span probs

Posted by Joern Kottmann <ko...@gmail.com>.
Hello Mark,

+1 for your second solution. I believe that is much more intuitive than
calling a method afterwards to retrieve the prob for a Span.
It is easier to use because the prob is delivered as part of the result and
no user action is required to obtain it.

We could use this solution everywhere a Span gets returned.
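
As a rough illustration of why that is easier for callers, the sketch below
shows downstream code filtering spans by confidence without holding a
reference to the finder that produced them. ScoredSpan and confidentSpans are
hypothetical names used only for this example, not existing OpenNLP API, and
the values are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: ScoredSpan and confidentSpans are not OpenNLP names.
final class LinkerInputSketch {

    /** A span that carries its own prob as part of the result. */
    static final class ScoredSpan {
        final int start;   // first token index, inclusive
        final int end;     // last token index, exclusive
        final double prob; // confidence delivered with the span

        ScoredSpan(int start, int end, double prob) {
            this.start = start;
            this.end = end;
            this.prob = prob;
        }
    }

    /** Keeps spans whose own prob is at least minProb; no finder is needed. */
    static List<ScoredSpan> confidentSpans(List<ScoredSpan> spans, double minProb) {
        List<ScoredSpan> kept = new ArrayList<>();
        for (ScoredSpan s : spans) {
            if (s.prob >= minProb) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<ScoredSpan> spans = Arrays.asList(
                new ScoredSpan(0, 1, 0.93),
                new ScoredSpan(4, 6, 0.58)); // made-up values for illustration
        for (ScoredSpan s : confidentSpans(spans, 0.75)) {
            System.out.println("[" + s.start + ".." + s.end + ") prob=" + s.prob);
        }
    }
}
```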

Jörn


