Posted to dev@opennlp.apache.org by "Miller, Timothy" <Ti...@childrens.harvard.edu> on 2014/06/19 18:00:39 UTC

interest in new parser?

There is a paper at this year's ACL conference on a statistical parser
with some interesting properties [1]. I tracked down the software [2]
and it is Apache-licensed (unlike most other high-quality parsers such
as the Berkeley and Stanford parsers). It is written in Scala, which runs
on the JVM, so in theory it should be compatible with OpenNLP's Java code.
Most importantly, it is about as accurate as those state-of-the-art
parsers on English (about 33% error reduction from the Ratnaparkhi parser
that OpenNLP currently uses), and may be superior for cross-language
performance.
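
To be clear about how to read that figure: the reduction is relative to the baseline parser's error, not an absolute gain in F1. A minimal sketch of the arithmetic follows. The baseline F1 of 87.0 is a made-up illustrative value, not a measured score for either parser, and the function name is hypothetical.

```python
def f1_after_error_reduction(baseline_f1, relative_reduction):
    """Return the F1 implied by cutting the baseline error (100 - F1)
    by the given relative fraction."""
    baseline_error = 100.0 - baseline_f1
    new_error = baseline_error * (1.0 - relative_reduction)
    return 100.0 - new_error


# Hypothetical baseline F1 of 87.0 (illustrative only, not a reported score).
print(round(f1_after_error_reduction(87.0, 0.33), 2))
```

With a lower baseline the same relative reduction translates into a larger absolute F1 gain, which is why the two numbers should not be confused.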

I am going to play with it with some of our clinical data to get a feel
for speed/accuracy on clinical text. Just curious if there is any
interest in a wrapper for this parser in opennlp?

[1] Paper link:
http://69.195.124.161/~aclwebor/anthology///P/P14/P14-1022.pdf
[2] Software: https://github.com/dlwh/epic

-- 
Tim Miller
Instructor
Boston Children's Hospital and Harvard Medical School
timothy.miller@childrens.harvard.edu
617-919-1223

Re: interest in new parser?

Posted by Jörn Kottmann <ko...@gmail.com>.

+1

Receiving Scala contributions should be fine too, but we should then
make it usable for Java applications.

Jörn

On 06/23/2014 10:44 PM, Chen, Pei wrote:
> If it looks promising, we can probably approach the developers to see if they would be interested in porting over the entire code.
> They will probably be open to it, especially if we're willing to help maintain it...

RE: interest in new parser?

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.

Tim,
If it looks promising, we can probably approach the developers to see if they would be interested in porting over the entire code.
They will probably be open to it, especially if we're willing to help maintain it...

> -----Original Message-----
> From: Jörn Kottmann [mailto:kottmann@gmail.com]
> Sent: Monday, June 23, 2014 8:09 AM
> To: dev@opennlp.apache.org
> Subject: Re: interest in new parser?
> 
> +1, it would be possible to include different styles and
> implementations of parsers in OpenNLP.
> 
> Jörn
> 
> On 06/23/2014 01:12 PM, Rodrigo Agerri wrote:
> > Hello,
> >
> > Ratnaparkhi's (1999) parser is a shift-reduce parser. Others, like
> > Stanford NLP, are now releasing shift-reduce parsers. There are
> > differences between them, though. For example, Zhang and Clark (2009)'s
> > parser (cited by Stanford's new parser) is similar except that they use
> > a global discriminative model trained with the Collins (2002) perceptron,
> > whereas Ratnaparkhi’s parser has separate probabilities of actions
> > chained together in a conditional model (based on maximum entropy, ME).
> >
> > Perhaps that route, among others, would be an interesting one to have
> > a new parser in opennlp.
> >
> > Cheers,
> >
> > Rodrigo
> >
> >
> >
> > On Sun, Jun 22, 2014 at 9:44 PM, Richard Eckart de Castilho
> > <ri...@gmail.com> wrote:
> >> Some time ago I asked the mstparser developers if they would consider
> >> contributing the parser to OpenNLP. They said that mstparser isn't
> >> up-to-date anymore since better parsers are now available, but in
> >> principle didn't reject the idea.
> >>
> >> If OpenNLP was interested in adopting the mstparser, that might be
> >> something to follow up on.
> >>
> >> The mstparser is only a dependency parser, not a constituency parser
> >> like the one currently included with OpenNLP.
> >>
> >> Cheers,
> >>
> >> -- Richard
> >>
> >> On 20.06.2014, at 09:33, Jörn Kottmann <ko...@gmail.com> wrote:
> >>
> >>> On 06/19/2014 06:00 PM, Miller, Timothy wrote:
> >>>> There is a paper at this year's ACL conference on a statistical
> >>>> parser with some interesting properties [1]. I tracked down the
> >>>> software [2] and it is apache-licensed (unlike most other high
> >>>> quality parsers such as the Berkeley and Stanford parsers). It is
> >>>> written in Scala so in theory it should be compatible. Most
> >>>> importantly it is about as accurate as those state of the art
> >>>> parsers on English (about 33% error reduction from the Ratnaparkhi
> >>>> parser that opennlp currently uses), and may be superior for
> >>>> cross-language performance.
> >>>>
> >>>> I am going to play with it with some of our clinical data to get a
> >>>> feel for speed/accuracy on clinical text. Just curious if there is
> >>>> any interest in a wrapper for this parser in opennlp?
> >>> I don't think a wrapper is interesting for us. If people want to use
> >>> this parser it is probably better if they integrate it directly or use a
> >>> component framework like UIMA or GATE.
> >>>
> >>> Anyway, getting a new parser as a contribution would be interesting.
> >>>
> >>> Jörn

Re: interest in new parser?

Posted by Jörn Kottmann <ko...@gmail.com>.

+1, it would be possible to include different styles and
implementations of parsers in OpenNLP.

Jörn

On 06/23/2014 01:12 PM, Rodrigo Agerri wrote:
> Hello,
>
> Ratnaparkhi's (1999) parser is a shift-reduce parser. Others, like
> Stanford NLP, are now releasing shift-reduce parsers. There are
> differences between them, though. For example, Zhang and Clark (2009)'s
> parser (cited by Stanford's new parser) is similar except that they use
> a global discriminative model trained with the Collins (2002) perceptron,
> whereas Ratnaparkhi’s parser has separate probabilities of actions
> chained together in a conditional model (based on maximum entropy, ME).
>
> Perhaps that route, among others, would be an interesting one to have
> a new parser in opennlp.
>
> Cheers,
>
> Rodrigo
>
>
>
> On Sun, Jun 22, 2014 at 9:44 PM, Richard Eckart de Castilho
> <ri...@gmail.com> wrote:
>> Some time ago I asked the mstparser developers if they would consider
>> contributing the parser to OpenNLP. They said that mstparser isn't
>> up-to-date anymore since better parsers are now available, but in
>> principle didn't reject the idea.
>>
>> If OpenNLP was interested in adopting the mstparser, that might
>> be something to follow up on.
>>
>> The mstparser is only a dependency parser, not a constituency parser
>> like the one currently included with OpenNLP.
>>
>> Cheers,
>>
>> -- Richard
>>
>> On 20.06.2014, at 09:33, Jörn Kottmann <ko...@gmail.com> wrote:
>>
>>> On 06/19/2014 06:00 PM, Miller, Timothy wrote:
>>>> There is a paper at this year's ACL conference on a statistical parser
>>>> with some interesting properties [1]. I tracked down the software [2]
>>>> and it is apache-licensed (unlike most other high quality parsers such
>>>> as the Berkeley and Stanford parsers). It is written in Scala so in
>>>> theory it should be compatible. Most importantly it is about as accurate
>>>> as those state of the art parsers on English (about 33% error reduction
>>>> from the Ratnaparkhi parser that opennlp currently uses), and may be
>>>> superior for cross-language performance.
>>>>
>>>> I am going to play with it with some of our clinical data to get a feel
>>>> for speed/accuracy on clinical text. Just curious if there is any
>>>> interest in a wrapper for this parser in opennlp?
>>> I don't think a wrapper is interesting for us. If people want to use this parser it is probably
>>> better if they integrate it directly or use a component framework like UIMA or GATE.
>>>
>>> Anyway, getting a new parser as a contribution would be interesting.
>>>
>>> Jörn

Re: interest in new parser?

Posted by Rodrigo Agerri <ra...@apache.org>.

Hello,

Ratnaparkhi's (1999) parser is a shift-reduce parser. Others, like
Stanford NLP, are now releasing shift-reduce parsers. There are
differences between them, though. For example, Zhang and Clark (2009)'s
parser (cited by Stanford's new parser) is similar except that they use
a global discriminative model trained with the Collins (2002) perceptron,
whereas Ratnaparkhi’s parser has separate probabilities of actions
chained together in a conditional model (based on maximum entropy, ME).
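
To make the contrast concrete, here is a rough sketch of the two scoring styles. It is a toy under stated assumptions, not the code of any of the parsers mentioned: the action set, the feature strings, and all function names are hypothetical, and the features are fixed per step, whereas in a real shift-reduce parser they depend on the parser state reached by the earlier actions.

```python
import math

# Toy action inventory (hypothetical; real parsers use labelled actions).
ACTIONS = ["shift", "reduce"]


def local_log_prob(weights, feature_seq, action_seq):
    """Locally normalized, MaxEnt-style scoring: a softmax over the
    actions at every step, with the per-step log-probabilities summed."""
    total = 0.0
    for feats, action in zip(feature_seq, action_seq):
        scores = {a: sum(weights.get((f, a), 0.0) for f in feats)
                  for a in ACTIONS}
        log_z = math.log(sum(math.exp(s) for s in scores.values()))
        total += scores[action] - log_z
    return total


def global_score(weights, feature_seq, action_seq):
    """Globally scored, perceptron-style: raw feature weights summed over
    the whole action sequence, with no per-step normalization."""
    return sum(weights.get((f, a), 0.0)
               for feats, a in zip(feature_seq, action_seq)
               for f in feats)


def perceptron_update(weights, feature_seq, gold, predicted):
    """Structured perceptron update: add the gold sequence's features and
    subtract the predicted sequence's. Steps where both agree cancel out
    here only because the toy features are fixed per step."""
    for feats, g, p in zip(feature_seq, gold, predicted):
        if g != p:
            for f in feats:
                weights[(f, g)] = weights.get((f, g), 0.0) + 1.0
                weights[(f, p)] = weights.get((f, p), 0.0) - 1.0
```

In the first style each decision is a probability distribution on its own and the sequence score is their product; in the second only the total score of the whole derivation matters, so the model can trade off a locally unlikely action against a better overall parse (typically searched with a beam).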

Perhaps that route, among others, would be an interesting one to have
a new parser in opennlp.

Cheers,

Rodrigo



On Sun, Jun 22, 2014 at 9:44 PM, Richard Eckart de Castilho
<ri...@gmail.com> wrote:
> Some time ago I asked the mstparser developers if they would consider
> contributing the parser to OpenNLP. They said that mstparser isn't
> up-to-date anymore since better parsers are now available, but in
> principle didn't reject the idea.
>
> If OpenNLP was interested in adopting the mstparser, that might
> be something to follow up on.
>
> The mstparser is only a dependency parser, not a constituency parser
> like the one currently included with OpenNLP.
>
> Cheers,
>
> -- Richard
>
> On 20.06.2014, at 09:33, Jörn Kottmann <ko...@gmail.com> wrote:
>
>> On 06/19/2014 06:00 PM, Miller, Timothy wrote:
>>> There is a paper at this year's ACL conference on a statistical parser
>>> with some interesting properties [1]. I tracked down the software [2]
>>> and it is apache-licensed (unlike most other high quality parsers such
>>> as the Berkeley and Stanford parsers). It is written in Scala so in
>>> theory it should be compatible. Most importantly it is about as accurate
>>> as those state of the art parsers on English (about 33% error reduction
>>> from the Ratnaparkhi parser that opennlp currently uses), and may be
>>> superior for cross-language performance.
>>>
>>> I am going to play with it with some of our clinical data to get a feel
>>> for speed/accuracy on clinical text. Just curious if there is any
>>> interest in a wrapper for this parser in opennlp?
>>
>> I don't think a wrapper is interesting for us. If people want to use this parser it is probably
>> better if they integrate it directly or use a component framework like UIMA or GATE.
>>
>> Anyway, getting a new parser as a contribution would be interesting.
>>
>> Jörn
>

Re: interest in new parser?

Posted by Richard Eckart de Castilho <ri...@gmail.com>.

Some time ago I asked the mstparser developers if they would consider
contributing the parser to OpenNLP. They said that mstparser isn't
up-to-date anymore since better parsers are now available, but in
principle didn't reject the idea.

If OpenNLP was interested in adopting the mstparser, that might
be something to follow up on.

The mstparser is only a dependency parser, not a constituency parser
like the one currently included with OpenNLP.

Cheers,

-- Richard

On 20.06.2014, at 09:33, Jörn Kottmann <ko...@gmail.com> wrote:

> On 06/19/2014 06:00 PM, Miller, Timothy wrote:
>> There is a paper at this year's ACL conference on a statistical parser
>> with some interesting properties [1]. I tracked down the software [2]
>> and it is apache-licensed (unlike most other high quality parsers such
>> as the Berkeley and Stanford parsers). It is written in Scala so in
>> theory it should be compatible. Most importantly it is about as accurate
>> as those state of the art parsers on English (about 33% error reduction
>> from the Ratnaparkhi parser that opennlp currently uses), and may be
>> superior for cross-language performance.
>> 
>> I am going to play with it with some of our clinical data to get a feel
>> for speed/accuracy on clinical text. Just curious if there is any
>> interest in a wrapper for this parser in opennlp?
> 
> I don't think a wrapper is interesting for us. If people want to use this parser it is probably
> better if they integrate it directly or use a component framework like UIMA or GATE.
> 
> Anyway, getting a new parser as a contribution would be interesting.
> 
> Jörn

Re: interest in new parser?

Posted by Jörn Kottmann <ko...@gmail.com>.

On 06/19/2014 06:00 PM, Miller, Timothy wrote:
> There is a paper at this year's ACL conference on a statistical parser
> with some interesting properties [1]. I tracked down the software [2]
> and it is apache-licensed (unlike most other high quality parsers such
> as the Berkeley and Stanford parsers). It is written in Scala so in
> theory it should be compatible. Most importantly it is about as accurate
> as those state of the art parsers on English (about 33% error reduction
> from the Ratnaparkhi parser that opennlp currently uses), and may be
> superior for cross-language performance.
>
> I am going to play with it with some of our clinical data to get a feel
> for speed/accuracy on clinical text. Just curious if there is any
> interest in a wrapper for this parser in opennlp?

I don't think a wrapper is interesting for us. If people want to use this
parser it is probably better if they integrate it directly or use a
component framework like UIMA or GATE.

Anyway, getting a new parser as a contribution would be interesting.

Jörn