Posted to users@opennlp.apache.org by Benedict Holland <be...@gmail.com> on 2017/10/06 15:26:01 UTC
Name training data sentences
Hello all,
I am working on getting together a file with a list of tokenized sentences.
I have a quick question:
Can name training data contain sentences without any tags?
For example, if I had sentences like
<START:person> Molly <END> enjoys pancakes in the morning .
She does not enjoy being woken up at 4:30 by her cat .
Does the second sentence provide any additional benefit to the ME model?
The answer to this question should probably be in the documentation.
Thanks,
~Ben
Re: Name training data sentences
Posted by Joern Kottmann <ko...@gmail.com>.
Please send us a PR to update the documentation. We are always happy
about new contributors.
We have an annotation service which can be called from BRAT. There are
no tools to produce brat output otherwise.
Jörn
Re: Name training data sentences
Posted by Benedict Holland <be...@gmail.com>.
Hi All,
That is exactly what I assumed but that information isn't included in the
documentation.
Also, if there is a way to integrate the name finder with brat annotation to
create brat annotated files using a name finder, that would be superb to put
within the documentation as well.
BTW, this circular analysis is genuinely amazing. I am incredibly
impressed.
Thanks,
~Ben
Re: Name training data sentences
Posted by Gary Underwood <gu...@clinacuity.com>.
I like to think of it as providing examples of what you do NOT want tagged.
Gary Underwood
gunderwood@clinacuity.com
Re: Name training data sentences
Posted by Joern Kottmann <ko...@gmail.com>.
It is like Daniel says and it is good to have training data that is
close to the data you intend to process with the model.
Jörn
Re: Name training data sentences
Posted by Dan Russ <da...@gmail.com>.
I believe it does. Every word is classified as “begin”, “inside”, or “outside” (BIO encoding), so an event is generated for “She”, then “does”, then “not”, all of which are classified as “outside”.
Anyone smarter have a comment on this???
Daniel
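To make the BIO point concrete, here is a rough sketch of how the two example sentences could be turned into per-token labels. This is only an illustration, not OpenNLP's actual implementation: the `bio_labels` helper and the `B-`/`I-`/`O` label names are assumptions made for this example, and the outcome names OpenNLP uses internally may differ.

```python
import re

# Matches an opening tag such as <START:person> or a bare <START>.
START = re.compile(r"^<START(?::([^>]+))?>$")
END = "<END>"


def bio_labels(line):
    """Return (token, label) pairs for one whitespace-tokenized training sentence."""
    pairs = []
    current = None   # entity type of the open span, or None when outside a span
    first = False    # True until the first token of the open span has been labeled
    for tok in line.split():
        match = START.match(tok)
        if match:
            current = match.group(1) or "default"
            first = True
        elif tok == END:
            current = None
        elif current is None:
            # Token outside any span: this is the "outside" evidence.
            pairs.append((tok, "O"))
        else:
            pairs.append((tok, ("B-" if first else "I-") + current))
            first = False
    return pairs


sentences = [
    "<START:person> Molly <END> enjoys pancakes in the morning .",
    "She does not enjoy being woken up at 4:30 by her cat .",
]

for sentence in sentences:
    print(bio_labels(sentence))
```

Under these assumptions the untagged second sentence still yields one labeled token per word, all "outside", so it contributes training examples even though it contains no name.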