Posted to users@opennlp.apache.org by Christopher Hansen <Ch...@boldtypenews.com> on 2017/01/03 21:25:45 UTC

OpenNLP, telephone numbers, ticker symbols and URLs

Hello,

OpenNLP is an awesome tool!

Has anyone extended OpenNLP to extract entities such as telephone numbers, ticker symbols and URLs from text?

If not, is there a way to train the system to extract these entities?

Would these tools be of interest if we created them?


Chris

Christopher C. Hansen
Founder
BoldTypeNews<http://www.boldtypenews.com/>
T: +1 (845) 351-3435


Re: OpenNLP, telephone numbers, ticker symbols and URLs

Posted by Joern Kottmann <ko...@gmail.com>.
Have a look here; we have some predefined patterns for phone numbers,
URLs, and email addresses.

https://github.com/apache/opennlp/blob/trunk/opennlp-tools/src/main/java/opennlp/tools/namefind/RegexNameFinderFactory.java

Jörn

On Wed, 2017-01-04 at 00:00 +0100, Damiano Porta wrote:
> Hello Chris,
> You do not need an extension. There is the RegexNameFinder that can match
> your entities as well
> 
> here:
> https://github.com/apache/opennlp/blob/trunk/opennlp-tools/src/main/java/opennlp/tools/namefind/RegexNameFinder.java
> 
> Damiano
> 

Re: OpenNLP, telephone numbers, ticker symbols and URLs

Posted by Damiano Porta <da...@gmail.com>.
Hello Chris,
You do not need an extension. The existing RegexNameFinder can match
your entities as well.

here:
https://github.com/apache/opennlp/blob/trunk/opennlp-tools/src/main/java/opennlp/tools/namefind/RegexNameFinder.java

Damiano
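
For readers who want to see the regex-based idea in isolation, here is a
minimal sketch. It deliberately uses only java.util.regex, not the OpenNLP
classes, so it does not show RegexNameFinder's actual API. The class name
RegexEntitySketch, the Entity type, and all three patterns (including the
choice to treat ticker symbols as $-prefixed "cashtags") are illustrative
assumptions, not the patterns shipped with OpenNLP.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexEntitySketch {

    /** A matched entity: its type label, character offsets [start, end), and text. */
    public static final class Entity {
        public final String type;
        public final int start;
        public final int end;
        public final String text;

        Entity(String type, int start, int end, String text) {
            this.type = type;
            this.start = start;
            this.end = end;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ") " + text;
        }
    }

    // Hypothetical patterns for illustration only; real-world phone numbers,
    // URLs, and ticker symbols need broader (and locale-aware) expressions.
    private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        // Optional country code, then a North American style 3-3-4 number.
        PATTERNS.put("phone", Pattern.compile(
                "(?:\\+?\\d{1,3}[ .-]?)?\\(?\\d{3}\\)?[ .-]?\\d{3}[ .-]?\\d{4}"));
        // http or https URL running up to the next whitespace, quote, or angle bracket.
        PATTERNS.put("url", Pattern.compile("https?://[^\\s<>\"]+"));
        // Assumption: tickers are written as cashtags such as $ACME.
        PATTERNS.put("ticker", Pattern.compile("\\$[A-Z]{1,5}\\b"));
    }

    /** Runs every pattern over the text and collects the matches, grouped by pattern order. */
    public static List<Entity> find(String text) {
        List<Entity> found = new ArrayList<>();
        for (Map.Entry<String, Pattern> entry : PATTERNS.entrySet()) {
            Matcher matcher = entry.getValue().matcher(text);
            while (matcher.find()) {
                found.add(new Entity(entry.getKey(), matcher.start(),
                        matcher.end(), matcher.group()));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Call +1 (212) 555-0100 or visit "
                + "http://www.example.com/ir for $ACME news";
        for (Entity entity : find(text)) {
            System.out.println(entity);
        }
    }
}
```

Keeping one labeled pattern per entity type mirrors the general approach
of a regex name finder: no training data is needed, but each pattern has
to be maintained by hand and will miss formats it was not written for.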
