Posted to users@opennlp.apache.org by Svetoslav Marinov <sv...@findwise.com> on 2013/04/25 15:15:09 UTC

Re: coref results seem weird

What corpus has the English Coref module been trained on?

Can someone provide some guidance on which language-specific resources
(modulo sentence splitters, tokenizers, POS tagger, parser and NER) are
needed in order to get coreference working for a new language? A
WordNet? What else?

Thank you in advance!

Best,
Svetoslav

On 2013-03-06 14:15, "Jim - FooBar();" <ji...@gmail.com> wrote:

>right, finally the moment of truth! here is what I get using your loop:
>
>     Mention set:: [  this British industrial conglomerate   ]
>     Mention set:: [  a director   ]
>     Mention set:: [  Consolidated Gold Fields PLC   ]
>     Mention set:: [  chairman  :: former chairman   ]
>     Mention set:: [  55 years   ]
>     Mention set:: [  Rudolph Agnew   ]
>     Mention set:: [  Elsevier N . V .  :: the Dutch publishing group   ]
>     Mention set:: [  Mr . Vinken   ]
>     Mention set:: [  a nonexecutive director Nov . 29   ]
>     Mention set:: [  the board   ]
>     Mention set:: [  61 years   ]
>     Mention set:: [  Pierre Vinken   ]
>
>As you can see I am missing these 2 which do seem correct (in your
>output):
>
>  Mention set:: [ Pierre Vinken  :: Mr. Vinken  ]
>  Mention set:: [ a nonexecutive director  :: chairman  :: former
>chairman  :: a director  ]
>
>Jim
>
>ps: I can confirm that the NEs can be retrieved correctly from the
>parse-tree just like yours...
>
>
>
>On 06/03/13 12:19, Jim - FooBar(); wrote:
>> ok found your code and it answers all my questions!....I'll do the
>> same now and see what happens... :)
>>
>> Jim
>>
>>
>> On 06/03/13 12:01, Jim - FooBar(); wrote:
>>> I'm sorry I forgot another thing...how do you ask for the NEs from
>>> the Mention sets? I can only ask for the NEs from the Parse object...
>>>
>>> Jim
>>>
>>>
>>> On 06/03/13 11:58, Jim - FooBar(); wrote:
>>>> Hi there,
>>>>
>>>> I apologise for the late reply but I've been a bit ill the past few
>>>> days...
>>>> So I've got some good news and some bad news...let me explain:
>>>>
>>>> here are my parses for each sentence (notice how they are identical
>>>> to yours -> GOOD news!):
>>>> -------------------------------------
>>>> (TOP (S (NP (person (NP (NNP Pierre) (NNP Vinken)) (, ,)) (ADJP (NP
>>>> (CD 61) (NNS years)) (JJ old))) (, ,) (VP (MD will) (VP (VB join)
>>>> (NP (DT the) (NN board)) (PP (IN as) (NP (DT a) (JJ nonexecutive)
>>>> (NN director) (NNP Nov) (NNP .) (CD 29))))) (. .)))
>>>>
>>>> (TOP (S (NP (NNP Mr) (. .) (NNP Vinken)) (VP (VBZ is) (NP (NP (NN
>>>> chairman)) (PP (IN of) (NP (NP (NNP Elsevier) (NNP N) (NNP .) (NNP
>>>> V) (NNP .)) (, ,) (NP (DT the) (JJ Dutch) (NN publishing) (NN
>>>> group)))))) (. .)))
>>>>
>>>> (TOP (NP (person (NP (NNP Rudolph) (NNP Agnew)) (, ,)) (UCP (ADJP
>>>> (NP (CD 55) (NNS years)) (JJ old)) (CC and) (S (NP (NP (JJ former)
>>>> (NN chairman)) (PP (IN of) (NP (NNP Consolidated) (NNP Gold) (NNP
>>>> Fields) (NNP PLC)))) (, ,) (VP (VBD was) (VP (VBN named) (S (NP (NP
>>>> (DT a) (NN director)) (PP (IN of) (NP (DT this) (JJ British) (JJ
>>>> industrial) (NN conglomerate))))))))) (. .)))
>>>> --------------------------------
>>>>
>>>> Doing this right is more than half the story for the
>>>> coref-linker...Now, though there is a slight problem. What exactly
>>>> is a Mention set in your output? How come you're not getting an
>>>> array of DiscourseEntities back? In addition, are you filtering the
>>>> resulting array for entities with size more than 1?
>>>>
>>>> I have to say, your output does seem correct from a coreference
>>>> resolution perspective...the problem is I can't understand why we're
>>>> getting different results...If you could explain what is a
>>>> MentionSet that would be great...
>>>>
>>>> thanks again,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>> On 04/03/13 05:29, Ant B wrote:
>>>>> Hi Jim,
>>>>>
>>>>> No problem - a good excuse to tidy the code.  I added a few
>>>>> println() calls to display input text, sentence parse objects,
>>>>> coreference mention sets and named entities in those sets. Note
>>>>> that I only added "person" NERs to the sentence parse.
>>>>>
>>>>>
>>>>> <start of code output>
>>>>>
>>>>> Input sentences::
>>>>> Pierre Vinken, 61 years old, will join the board as a nonexecutive
>>>>> director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the
>>>>> Dutch publishing group. Rudolph Agnew, 55 years old and former
>>>>> chairman of Consolidated Gold Fields PLC, was named a director of
>>>>> this British industrial conglomerate.
>>>>>
>>>>> Sentence#1 parse after POS & NER tag:
>>>>> (TOP (S (NP (person (NP (NNP Pierre) (NNP Vinken))(, ,)) (ADJP (NP
>>>>> (CD 61) (NNS years)) (JJ old)))(, ,) (VP (MD will) (VP (VB join)
>>>>> (NP (DT the) (NN board)) (PP (IN as) (NP (NP (DT a) (JJ
>>>>> nonexecutive) (NN director)) (NP (NNP Nov.) (CD 29))))))(. .)))
>>>>>
>>>>> Sentence#2 parse after POS & NER tag:
>>>>> (TOP (S (NP (NNP Mr.) (NNP Vinken)) (VP (VBZ is) (NP (NP (NN
>>>>> chairman)) (PP (IN of) (NP (NP (NNP Elsevier) (NNP N.V.))(, ,) (NP
>>>>> (DT the) (JJ Dutch) (NN publishing) (NN group))))))(. .)))
>>>>>
>>>>> Sentence#3 parse after POS & NER tag:
>>>>> (TOP (NP (person (NP (NNP Rudolph) (NNP Agnew))(, ,)) (UCP (ADJP
>>>>> (NP (CD 55) (NNS years)) (JJ old)) (CC and) (S (NP (NP (JJ former)
>>>>> (NN chairman)) (PP (IN of) (NP (NNP Consolidated) (NNP Gold) (NNP
>>>>> Fields) (NNP PLC))))(, ,) (VP (VBD was) (VP (VBN named) (S (NP (NP
>>>>> (DT a) (NN director)) (PP (IN of) (NP (DT this) (JJ British) (JJ
>>>>> industrial) (NN conglomerate)))))))))(. .)))
>>>>>
>>>>> Now displaying all discourse entities::
>>>>>     Mention set:: [ this British industrial conglomerate  ]
>>>>>     Mention set:: [ a nonexecutive director  :: chairman  :: former
>>>>> chairman  :: a director  ]
>>>>>     Mention set:: [ Consolidated Gold Fields PLC  ]
>>>>>     Mention set:: [ 55 years  ]
>>>>>     Mention set:: [ Rudolph Agnew  ]
>>>>>     Mention set:: [ Elsevier N.V.  :: the Dutch publishing group  ]
>>>>>     Mention set:: [ Pierre Vinken  :: Mr. Vinken  ]
>>>>>     Mention set:: [ Nov. 29  ]
>>>>>     Mention set:: [ the board  ]
>>>>>     Mention set:: [ 61 years  ]
>>>>>
>>>>>
>>>>> Now printing out the named entities from mention sets::
>>>>>     [Rudolph Agnew ]
>>>>>     [Pierre Vinken ]
>>>>>
>>>>> <end of code output>
>>>>>
>>>>>
>>>>> I do not know for certain that my code is correct, so thanks for
>>>>> the chance to compare data.  I think this matches up with your
>>>>> results.
>>>>>
>>>>> Let me know if you want to hack around any further - I'd really
>>>>> like a correct, reviewed & validated coreference example. I'm sure
>>>>> others would benefit from such an example too.
>>>>>
>>>>>
>>>>> Ant
>>>>>
>>>>> On Mar 3, 2013, at 12:37 PM, Jim - FooBar(); <ji...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> do you still happen to have that project locally in your hard
>>>>>> drive? is there any chance you could run this and post the
>>>>>> results? Alternatively, I'd have to clone the repo and give it a
>>>>>> spin...I imagine it would be a lot easier for you as you are the
>>>>>> author...
>>>>>>
>>>>>> let me know if you can't do that...I noticed in your code that you
>>>>>> already use the sentence i am interested in which is good.... :)
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>> On 03/03/13 03:29, amb.enthusiast@gmail.com wrote:
>>>>>>> Hey guys,
>>>>>>>
>>>>>>> You may have already seen this, but I posted a really simple
>>>>>>> "getting started" Eclipse project on my GitHub profile for the
>>>>>>> coreference API:
>>>>>>> https://github.com/amb-enthusiast/CoreferenceTest
>>>>>>>
>>>>>>> It is similar to what you have already explored, but may provide
>>>>>>> some validation.
>>>>>>>
>>>>>>>
>>>>>>> I recently experimented using NER results from Stanford CoreNLP
>>>>>>> in sentence parse objects.  In cases where better named entity
>>>>>>> info is added to the sentence parse, coref performance improves.
>>>>>>>
>>>>>>> Hope this helps in some way.
>>>>>>>
>>>>>>>
>>>>>>> Ant
>>>>>>>
>>>>>>> ----- Reply message -----
>>>>>>> From: "Jim - FooBar();" <ji...@gmail.com>
>>>>>>> To: <us...@opennlp.apache.org>
>>>>>>> Subject: coref results seem weird
>>>>>>> Date: Sat, Mar 2, 2013 9:10 am
>>>>>>>
>>>>>>>
>>>>>>> see this old message by Jorn... check the next message in the
>>>>>>> thread as well. at some point he posts detailed code...it might be
>>>>>>> easier to put together a dummy project in order to use the API,
>>>>>>> which is more flexible...
>>>>>>>
>>>>>>> Jim
>>>>>>>
>>>>>>> http://mail-archives.apache.org/mod_mbox/opennlp-users/201112.mbox/%3C4ED76AF3.8020503@gmail.com%3E
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 02/03/13 16:05, James Kosin wrote:
>>>>>>>> I'm trying to use the CLI.  I have other code in another project
>>>>>>>> that loads the dictionary properly using the property file to
>>>>>>>> specify the information.  Coreference does it differently...
>>>>>>>>
>>>>>>>> On 3/2/2013 6:32 AM, Jim - FooBar(); wrote:
>>>>>>>>> I should be able to help you...are you going through the CLI or
>>>>>>>>> the API? I bet there is something wrong with the WordNet directory
>>>>>>>>> you're passing...I had similar issues...
>>>>>>>>> If you're using the API let me know and I'll send you a code
>>>>>>>>> snippet that may help...
>>>>>>>>>
>>>>>>>>> Jim
>>>>>>>>>
>>>>>>>>> On 02/03/13 05:17, James Kosin wrote:
>>>>>>>>>> Jim,
>>>>>>>>>>
>>>>>>>>>> I can't seem to get past the null pointer exceptions when
>>>>>>>>>> Coreferencer is trying to load the dictionaries. So, this will
>>>>>>>>>> be much later now.  I'm going to sleep and play tooth fairy.
>>>>>>>>>>
>>>>>>>>>> James
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/1/2013 8:18 AM, Jim - FooBar(); wrote:
>>>>>>>>>>> Like you, I'm using the latest WordNet and JWNL (1.4 RC_3 is on
>>>>>>>>>>> maven, so you don't need to build it from source)....
>>>>>>>>>>> Now that you've set up your end could you please perform a run
>>>>>>>>>>> on the standard example sentence? In addition could you try to
>>>>>>>>>>> add the named-entities to the parse-tree?
>>>>>>>>>>> If so, could you please post your results here for comparison
>>>>>>>>>>> with mine?
>>>>>>>>>>>
>>>>>>>>>>> thanks a lot,
>>>>>>>>>>>
>>>>>>>>>>> Jim
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 01/03/13 02:21, James Kosin wrote:
>>>>>>>>>>>> Hi Jim,
>>>>>>>>>>>>
>>>>>>>>>>>> What version of the JWNL and WordNet dictionaries are you using?
>>>>>>>>>>>> I never got much further than researching what it is used for,
>>>>>>>>>>>> and its importance to handling the task.
>>>>>>>>>>>>
>>>>>>>>>>>> I've just updated my end for the 3.1 WordNet dictionaries. But
>>>>>>>>>>>> I'm also using 1.4_rc3 from sources to build JWNL. extJWNL
>>>>>>>>>>>> seems to be better suited to handling more types of dictionaries
>>>>>>>>>>>> (supporting UTF-8 and others), and to actually creating and
>>>>>>>>>>>> modifying them as well, which isn't needed when we really only
>>>>>>>>>>>> want read access.
>>>>>>>>>>>>
>>>>>>>>>>>> James
>>>>>>>>>>>>
>>>>>>>>>>>> On 2/28/2013 4:49 AM, Jim foo.bar wrote:
>>>>>>>>>>>>> Hi James,
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks for your reply and your comments but that is not quite
>>>>>>>>>>>>> what I asked...I've looked at all the web resources related to
>>>>>>>>>>>>> the opennlp coref component, otherwise I would never have
>>>>>>>>>>>>> gotten it to work!
>>>>>>>>>>>>>
>>>>>>>>>>>>> My problem is about the results it brings back; in particular
>>>>>>>>>>>>> I'd like to compare my produced discourse entities with someone
>>>>>>>>>>>>> else's on the same piece of text. Since I'm working in a
>>>>>>>>>>>>> language other than Java, that would confirm that my code is at
>>>>>>>>>>>>> least correct. On a secondary note, I'd like to see how to
>>>>>>>>>>>>> insert the named-entities into the parse tree before deploying
>>>>>>>>>>>>> the TreebankLinker. I followed the instructions posted by Jorn
>>>>>>>>>>>>> sometime last year but I'm not sure what the output should
>>>>>>>>>>>>> look like. That is why I posted what I'm getting...Can you see
>>>>>>>>>>>>> any 'person' named-entities in my DiscourseEntities?
>>>>>>>>>>>>>
>>>>>>>>>>>>> More importantly, if you run the coref component on the
>>>>>>>>>>>>> standard example sentence (Pierre Vinken, ...) what do you get?
>>>>>>>>>>>>> Could you post the exact output?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Whoever posted this:
>>>>>>>>>>>>> http://blog.dpdearing.com/2012/11/making-coreference-resolution-with-opennlp-1-5-0-your-bitch/
>>>>>>>>>>>>>
>>>>>>>>>>>>> did not try to insert any NEs into the parse tree. In addition,
>>>>>>>>>>>>> his output is slightly different from mine...I don't know if
>>>>>>>>>>>>> that is because of a newer version of JWNL.jar that I'm using
>>>>>>>>>>>>> or something else...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jim
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 28/02/13 02:51, James Kosin wrote:
>>>>>>>>>>>>>> Jim,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is a place to start, with maybe some more examples:
>>>>>>>>>>>>>> http://stackoverflow.com/questions/8629737/coreference-resolution-using-opennlp
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> James
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2/27/2013 1:26 PM, Jim - FooBar(); wrote:
>>>>>>>>>>>>>>> Hmmm.... interesting! When I run it on these 2 simple
>>>>>>>>>>>>>>> sentences:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> /"Mary likes pizza but she also likes kebabs. Knowing
>>>>>>>>>>>>>>> her, I'd
>>>>>>>>>>>>>>> give it 2 weeks before she turns massive!"/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I get perfect results!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> #<DiscourseEntity [ Mary, she, her, she ]>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> this demonstrates 3 things:
>>>>>>>>>>>>>>> - my understanding of coref is indeed correct
>>>>>>>>>>>>>>> - the coref component can link entities from separate
>>>>>>>>>>>>>>> sentences
>>>>>>>>>>>>>>> - possibly that my code is fine
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> any thoughts?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jim
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 27/02/13 18:14, Jim - FooBar(); wrote:
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I finally managed to get coref working (phew! my god, that was
>>>>>>>>>>>>>>>> tricky) but I'm slightly confused by the results, so I'd like
>>>>>>>>>>>>>>>> to see if anyone else has tried that out...Using the standard
>>>>>>>>>>>>>>>> paragraph used in the other examples:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /"Pierre Vinken, 61 years old, will join the board as a
>>>>>>>>>>>>>>>> nonexecutive director Nov. 29. Mr. Vinken is chairman of
>>>>>>>>>>>>>>>> Elsevier N.V., the Dutch publishing group. Rudolph
>>>>>>>>>>>>>>>> Agnew, 55
>>>>>>>>>>>>>>>> years old and former chairman of Consolidated Gold
>>>>>>>>>>>>>>>> Fields PLC,
>>>>>>>>>>>>>>>> was named a director of this British industrial
>>>>>>>>>>>>>>>> conglomerate."/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> deploying the coref component gives me the following:
>>>>>>>>>>>>>>>> I must note that I'm trying to pass the named entities as well
>>>>>>>>>>>>>>>> (person). I've confirmed that the spans are correctly
>>>>>>>>>>>>>>>> identified (3 spans for this particular example) and added
>>>>>>>>>>>>>>>> to the parse tree via
>>>>>>>>>>>>>>>> opennlp.tools.parser.Parse.addNames("person", span,
>>>>>>>>>>>>>>>> parse.getTagNodes());
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [#<DiscourseEntity [ this British industrial
>>>>>>>>>>>>>>>> conglomerate ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ a director of this British industrial
>>>>>>>>>>>>>>>> conglomerate ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ Consolidated Gold Fields PLC ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ chairman of Elsevier N . V . , the
>>>>>>>>>>>>>>>> Dutch
>>>>>>>>>>>>>>>> publishing group, former chairman of Consolidated Gold
>>>>>>>>>>>>>>>> Fields
>>>>>>>>>>>>>>>> PLC ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ 55 years ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ Rudolph Agnew , 55 years old and
>>>>>>>>>>>>>>>> former
>>>>>>>>>>>>>>>> chairman of Consolidated Gold Fields PLC , was named a
>>>>>>>>>>>>>>>> director of this British industrial conglomerate . ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ Elsevier N . V . , the Dutch
>>>>>>>>>>>>>>>> publishing
>>>>>>>>>>>>>>>> group, the Dutch publishing group ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ Mr . Vinken ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ a nonexecutive director Nov . 29 ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ the board ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ 61 years ]>,
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ Pierre Vinken , 61 years old ]>
>>>>>>>>>>>>>>>> ]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *filtering for entities with more than 1 mention (per Jorn's
>>>>>>>>>>>>>>>> suggestion) gives back:*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [#<DiscourseEntity [ chairman of Elsevier N . V . , the
>>>>>>>>>>>>>>>> Dutch
>>>>>>>>>>>>>>>> publishing group, former chairman of Consolidated Gold
>>>>>>>>>>>>>>>> Fields
>>>>>>>>>>>>>>>> PLC ]>
>>>>>>>>>>>>>>>>   #<DiscourseEntity [ Elsevier N . V . , the Dutch
>>>>>>>>>>>>>>>> publishing
>>>>>>>>>>>>>>>> group, the Dutch publishing group ]>
>>>>>>>>>>>>>>>> ]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Assuming that this is what it's supposed to output, can
>>>>>>>>>>>>>>>> someone explain this? First of all, where are the
>>>>>>>>>>>>>>>> named-entities? Secondly, out of the 2 filtered
>>>>>>>>>>>>>>>> DiscourseEntities, both seem plain wrong! Moreover, where is
>>>>>>>>>>>>>>>> #<DiscourseEntity [ Rudolph Agnew, former chairman of
>>>>>>>>>>>>>>>> Consolidated Gold Fields PLC, the Dutch publishing group,
>>>>>>>>>>>>>>>> director of this British industrial conglomerate ]> ???
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Either I'm not understanding coreference, or I've coded the
>>>>>>>>>>>>>>>> thing wrong, or the model is not very good! Which one is it?
>>>>>>>>>>>>>>>> Has anyone else attempted this? Can we compare results on this
>>>>>>>>>>>>>>>> particular sentence?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks in advance :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jim
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ps: my code is in Clojure but it is based on a code snippet
>>>>>>>>>>>>>>>> provided by Jorn to someone on the mailing list last year. I
>>>>>>>>>>>>>>>> can easily provide it but I don't think it will be of much
>>>>>>>>>>>>>>>> help...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>
>>>>
>>>
>>
>
>
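
For anyone repeating the comparison quoted above, the two runs can be
diffed mechanically rather than by eye. What follows is only a minimal
sketch in plain Java, not OpenNLP code. MentionSetDiff and all of its
methods are hypothetical names made up for illustration. It assumes the
"Mention set:: [ a :: b ]" line format printed in the thread, and it
treats "Mr . Vinken" and "Mr. Vinken" as the same mention by dropping
whitespace around periods.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for comparing printed mention sets; not part of OpenNLP.
class MentionSetDiff {

    // Build a comparison key: trim, drop whitespace around periods so that
    // "Mr . Vinken" and "Mr. Vinken" agree, then collapse remaining whitespace.
    static String normalize(String mention) {
        return mention.trim()
                .replaceAll("\\s*\\.\\s*", ".")
                .replaceAll("\\s+", " ");
    }

    // Parse one "Mention set:: [ a :: b ]" line into a set of normalized mentions.
    static Set<String> parseLine(String line) {
        String inner = line.substring(line.indexOf('[') + 1, line.lastIndexOf(']'));
        Set<String> mentions = new LinkedHashSet<>();
        for (String m : inner.split("::")) {
            mentions.add(normalize(m));
        }
        return mentions;
    }

    static List<Set<String>> parseAll(String... lines) {
        List<Set<String>> sets = new ArrayList<>();
        for (String line : lines) {
            sets.add(parseLine(line));
        }
        return sets;
    }

    // Keep only sets with more than one mention, i.e. actual coreference chains.
    static List<Set<String>> multiMention(List<Set<String>> sets) {
        List<Set<String>> result = new ArrayList<>();
        for (Set<String> s : sets) {
            if (s.size() > 1) result.add(s);
        }
        return result;
    }

    // Sets present in the reference run but absent from the other run.
    static List<Set<String>> missingFrom(List<Set<String>> reference,
                                         List<Set<String>> other) {
        List<Set<String>> missing = new ArrayList<>();
        for (Set<String> s : reference) {
            if (!other.contains(s)) missing.add(s);
        }
        return missing;
    }

    public static void main(String[] args) {
        // A few lines copied from the two outputs quoted in the thread.
        List<Set<String>> ant = parseAll(
            "Mention set:: [ a nonexecutive director  :: chairman  :: former chairman  :: a director  ]",
            "Mention set:: [ Elsevier N.V.  :: the Dutch publishing group  ]",
            "Mention set:: [ Pierre Vinken  :: Mr. Vinken  ]",
            "Mention set:: [ the board  ]");
        List<Set<String>> jim = parseAll(
            "Mention set:: [  chairman  :: former chairman   ]",
            "Mention set:: [  Elsevier N . V .  :: the Dutch publishing group   ]",
            "Mention set:: [  Mr . Vinken   ]",
            "Mention set:: [  Pierre Vinken   ]",
            "Mention set:: [  the board   ]");
        for (Set<String> s : missingFrom(multiMention(ant), multiMention(jim))) {
            System.out.println("missing: " + s);
        }
    }
}
```

With the lines above as input, the Elsevier set matches once the periods
are normalized, and the two sets reported as missing in the thread are
the ones that come back.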

Re: coref results seem weird

Posted by Jörn Kottmann <ko...@gmail.com>.

On 04/25/2013 03:15 PM, Svetoslav Marinov wrote:
> What corpus has the English Coref module been trained on?

I contributed code to train on MUC data, but there are still a few
problems with detecting possible mentions in the training data. If you
want to give that a try, I can help you get started.
As far as I know, the coref models have been trained on MUC data plus
some private data, but I am not sure if that is correct.

> Can someone provide some guidance into which language specific resources
> (modulo Sentence splitters, tokenizers, POS tagger, Parser and NER) are
> needed in order to get the coreference working for a new language. A
> Wordnet? What else?

Input needs to be:
- Sentence split
- Tokenized
- Either fully or shallowly parsed, depending on how you trained the coref model

If you don't have a WordNet dictionary for your language, you can probably
disable the piece of feature generation that uses it. I don't know how
that will affect the performance.

We will move the coref component to the sandbox for the next release and
hopefully get some help refactoring it so it can be moved back to the
tools package.

Having a second coref component, e.g. a rule-based one, would also be nice.

Jörn
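
Since the input has to be tokenized before it is parsed, one cheap check
when two setups disagree is whether they agree on the tokens at all. In
the parses quoted earlier in the thread, one run has "(NNP Mr) (. .)"
where the other has "(NNP Mr.)". The following is a minimal sketch that
pulls the leaf tokens out of a bracketed parse string so the two token
sequences can be compared. It assumes parses printed in the bracketed
form shown in the thread; ParseLeaves is a hypothetical name, not an
OpenNLP class, and it works on the printed text rather than on Parse
objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; works on printed bracketed parses, not on OpenNLP Parse objects.
class ParseLeaves {

    // A preterminal is "(TAG token)" with no nested parentheses inside it.
    private static final Pattern PRETERMINAL =
            Pattern.compile("\\(([^\\s()]+)\\s+([^\\s()]+)\\)");

    // Return the leaf tokens of a bracketed parse, left to right.
    static List<String> leaves(String bracketed) {
        List<String> tokens = new ArrayList<>();
        Matcher m = PRETERMINAL.matcher(bracketed);
        while (m.find()) {
            tokens.add(m.group(2)); // group 1 is the POS tag, group 2 the token
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Subject NP of the second sentence, as printed in the two runs.
        String jim = "(NP (NNP Mr) (. .) (NNP Vinken))";
        String ant = "(NP (NNP Mr.) (NNP Vinken))";
        System.out.println(leaves(jim)); // [Mr, ., Vinken]
        System.out.println(leaves(ant)); // [Mr., Vinken]
        System.out.println(leaves(jim).equals(leaves(ant))); // false
    }
}
```

If the token lists differ, the parses and the mentions built on them can
differ too, so the tokenizer is one place to look before suspecting the
coref model itself.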

Re: coref results seem weird

Posted by Rodrigo Agerri <ro...@ehu.es>.

Hello,

The CoNLL 2011 shared task has information about many types of approaches to
developing coreference systems. The winner (Stanford) was rule-based.

http://conll.cemantix.org/2011/introduction.html

Cheers,

Rodrigo

On Thu, Apr 25, 2013 at 3:15 PM, Svetoslav Marinov <
svetoslav.marinov@findwise.com> wrote:

> What corpus has the English Coref module been trained on?
>
> Can someone provide some guidance into which language specific resources
> (modulo Sentence splitters, tokenizers, POS tagger, Parser and NER) are
> needed in order to get the coreference working for a new language. A
> Wordnet? What else?
>
> Thank you in advance!
>
> Best,
> Svetoslav
>
> On 2013-03-06 14:15, "Jim - FooBar();" <ji...@gmail.com> wrote:
>
> >right, finally the moment of truth! here is what I get using your loop:
> >
> >     Mention set:: [  this British industrial conglomerate   ]
> >     Mention set:: [  a director   ]
> >     Mention set:: [  Consolidated Gold Fields PLC   ]
> >     Mention set:: [  chairman  :: former chairman   ]
> >     Mention set:: [  55 years   ]
> >     Mention set:: [  Rudolph Agnew   ]
> >     Mention set:: [  Elsevier N . V .  :: the Dutch publishing group   ]
> >     Mention set:: [  Mr . Vinken   ]
> >     Mention set:: [  a nonexecutive director Nov . 29   ]
> >     Mention set:: [  the board   ]
> >     Mention set:: [  61 years   ]
> >     Mention set:: [  Pierre Vinken   ]
> >
> >As you can see I am missing these 2 which do seem correct (in your
> >output):
> >
> >  Mention set:: [ Pierre Vinken  :: Mr. Vinken  ]
> >  Mention set:: [ a nonexecutive director  :: chairman  :: former
> >chairman  :: a director  ]
> >
> >Jim
> >
> >ps: I can confirm that the NEs can be retrieved correctly from the
> >parse-tree just like yours...
> >
> >
> >
> >On 06/03/13 12:19, Jim - FooBar(); wrote:
> >> ok found your code and it answers all my questions!....I'll do the
> >> same now and see what happens... :)
> >>
> >> Jim
> >>
> >>
> >> On 06/03/13 12:01, Jim - FooBar(); wrote:
> >>> I'm sorry I forgot another thing...how do you ask for the NEs from
> >>> the Mention sets? I can only ask for the NEs from the Parse object...
> >>>
> >>> Jim
> >>>
> >>>
> >>> On 06/03/13 11:58, Jim - FooBar(); wrote:
> >>>> Hi there,
> >>>>
> >>>> I apologise for the late reply but I've been a bit ill the past few
> >>>> days...
> >>>> So I've got some good news and some bad news...let me explain:
> >>>>
> >>>> here are my parses for each sentence (notice how they are identical
> >>>> to yours -> GOOD news!):
> >>>> -------------------------------------
> >>>> (TOP (S (NP (person (NP (NNP Pierre) (NNP Vinken)) (, ,)) (ADJP (NP
> >>>> (CD 61) (NNS years)) (JJ old))) (, ,) (VP (MD will) (VP (VB join)
> >>>> (NP (DT the) (NN board)) (PP (IN as) (NP (DT a) (JJ nonexecutive)
> >>>> (NN director) (NNP Nov) (NNP .) (CD 29))))) (. .)))
> >>>>
> >>>> (TOP (S (NP (NNP Mr) (. .) (NNP Vinken)) (VP (VBZ is) (NP (NP (NN
> >>>> chairman)) (PP (IN of) (NP (NP (NNP Elsevier) (NNP N) (NNP .) (NNP
> >>>> V) (NNP .)) (, ,) (NP (DT the) (JJ Dutch) (NN publishing) (NN
> >>>> group)))))) (. .)))
> >>>>
> >>>> (TOP (NP (person (NP (NNP Rudolph) (NNP Agnew)) (, ,)) (UCP (ADJP
> >>>> (NP (CD 55) (NNS years)) (JJ old)) (CC and) (S (NP (NP (JJ former)
> >>>> (NN chairman)) (PP (IN of) (NP (NNP Consolidated) (NNP Gold) (NNP
> >>>> Fields) (NNP PLC)))) (, ,) (VP (VBD was) (VP (VBN named) (S (NP (NP
> >>>> (DT a) (NN director)) (PP (IN of) (NP (DT this) (JJ British) (JJ
> >>>> industrial) (NN conglomerate))))))))) (. .)))
> >>>> --------------------------------
> >>>>
> >>>> Doing this right is more than half the story for the
> >>>> coref-linker...Now, though there is a slight problem. What exactly
> >>>> is a Mention set in your output? How come you're not getting an
> >>>> array of DiscourseEntities back? In addition, are you filtering the
> >>>> resulting array for entities with size more than 1?
> >>>>
> >>>> I have to say, your output does seem correct from a coreference
> >>>> resolution perspective...the problem is I can't understand why we're
> >>>> getting different results...If you could explain what is a
> >>>> MentionSet that would be great...
> >>>>
> >>>> thanks again,
> >>>>
> >>>> Jim
> >>>>
> >>>>
> >>>>
> >>>> On 04/03/13 05:29, Ant B wrote:
> >>>>> Hi Jim,
> >>>>>
> >>>>> No problem - a good excuse to tidy the code.  I added a few
> >>>>> println() calls to display input text, sentence parse objects,
> >>>>> coreference mention sets and named entities in those sets. Note
> >>>>> that I only added "person" NERs to the sentence parse.
> >>>>>
> >>>>>
> >>>>> <start of code output>
> >>>>>
> >>>>> Input sentences::
> >>>>> Pierre Vinken, 61 years old, will join the board as a nonexecutive
> >>>>> director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the
> >>>>> Dutch publishing group. Rudolph Agnew, 55 years old and former
> >>>>> chairman of Consolidated Gold Fields PLC, was named a director of
> >>>>> this British industrial conglomerate.
> >>>>>
> >>>>> Sentence#1 parse after POS & NER tag:
> >>>>> (TOP (S (NP (person (NP (NNP Pierre) (NNP Vinken))(, ,)) (ADJP (NP
> >>>>> (CD 61) (NNS years)) (JJ old)))(, ,) (VP (MD will) (VP (VB join)
> >>>>> (NP (DT the) (NN board)) (PP (IN as) (NP (NP (DT a) (JJ
> >>>>> nonexecutive) (NN director)) (NP (NNP Nov.) (CD 29))))))(. .)))
> >>>>>
> >>>>> Sentence#2 parse after POS & NER tag:
> >>>>> (TOP (S (NP (NNP Mr.) (NNP Vinken)) (VP (VBZ is) (NP (NP (NN
> >>>>> chairman)) (PP (IN of) (NP (NP (NNP Elsevier) (NNP N.V.))(, ,) (NP
> >>>>> (DT the) (JJ Dutch) (NN publishing) (NN group))))))(. .)))
> >>>>>
> >>>>> Sentence#3 parse after POS & NER tag:
> >>>>> (TOP (NP (person (NP (NNP Rudolph) (NNP Agnew))(, ,)) (UCP (ADJP
> >>>>> (NP (CD 55) (NNS years)) (JJ old)) (CC and) (S (NP (NP (JJ former)
> >>>>> (NN chairman)) (PP (IN of) (NP (NNP Consolidated) (NNP Gold) (NNP
> >>>>> Fields) (NNP PLC))))(, ,) (VP (VBD was) (VP (VBN named) (S (NP (NP
> >>>>> (DT a) (NN director)) (PP (IN of) (NP (DT this) (JJ British) (JJ
> >>>>> industrial) (NN conglomerate)))))))))(. .)))
> >>>>>
> >>>>> Now displaying all discourse entities::
> >>>>>     Mention set:: [ this British industrial conglomerate  ]
> >>>>>     Mention set:: [ a nonexecutive director  :: chairman  :: former
> >>>>> chairman  :: a director  ]
> >>>>>     Mention set:: [ Consolidated Gold Fields PLC  ]
> >>>>>     Mention set:: [ 55 years  ]
> >>>>>     Mention set:: [ Rudolph Agnew  ]
> >>>>>     Mention set:: [ Elsevier N.V.  :: the Dutch publishing group  ]
> >>>>>     Mention set:: [ Pierre Vinken  :: Mr. Vinken  ]
> >>>>>     Mention set:: [ Nov. 29  ]
> >>>>>     Mention set:: [ the board  ]
> >>>>>     Mention set:: [ 61 years  ]
> >>>>>
> >>>>>
> >>>>> Now printing out the named entities from mention sets::
> >>>>>     [Rudolph Agnew ]
> >>>>>     [Pierre Vinken ]
> >>>>>
> >>>>> <end of code output>
> >>>>>
> >>>>>
> >>>>> I do not know for certain that my code is correct, so thanks for
> >>>>> the chance to compare data.  I think this matches up with your
> >>>>> results.
> >>>>>
> >>>>> Let me know if you want to hack around any further - I'd really
> >>>>> like a correct, reviewed & validated coreference example. I'm sure
> >>>>> others would benefit from such an example too.
> >>>>>
> >>>>>
> >>>>> Ant
> >>>>>
> >>>>> On Mar 3, 2013, at 12:37 PM, Jim - FooBar(); <ji...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> do you still happen to have that project locally in your hard
> >>>>>> drive? is there any chance you could run this and post the
> >>>>>> results? Alternatively, I'd have to clone the repo and give it a
> >>>>>> spin...I imagine it would be a lot easier for you as you are the
> >>>>>> author...
> >>>>>>
> >>>>>> let me know if you can't do that...I noticed in your code that you
> >>>>>> already use the sentence i am interested in which is good.... :)
> >>>>>>
> >>>>>> Jim
> >>>>>>
> >>>>>>
> >>>>>> On 03/03/13 03:29, amb.enthusiast@gmail.com wrote:
> >>>>>>> Hey guys,
> >>>>>>>
> >>>>>>> You may have already seen this, but I posted a really simple
> >>>>>>> "getting started" Eclipse project on my GitHub profile for the
> >>>>>>> coreference API:
> >>>>>>> https://github.com/amb-enthusiast/CoreferenceTest
> >>>>>>>
> >>>>>>> It is similar to what you have already explored, but may provide
> >>>>>>> some validation.
> >>>>>>>
> >>>>>>>
> >>>>>>> I recently experimented using NER results from Stanford CoreNLP
> >>>>>>> in sentence parse objects.  In cases where better named entity
> >>>>>>> info is added to the sentence parse, coref performance improves.
> >>>>>>>
> >>>>>>> Hope this helps in some way.
> >>>>>>>
> >>>>>>>
> >>>>>>> Ant
> >>>>>>>
> >>>>>>> ----- Reply message -----
> >>>>>>> From: "Jim - FooBar();" <ji...@gmail.com>
> >>>>>>> To: <us...@opennlp.apache.org>
> >>>>>>> Subject: coref results seem weird
> >>>>>>> Date: Sat, Mar 2, 2013 9:10 am
> >>>>>>>
> >>>>>>>
> >>>>>>> see this old message by Jorn... check the next message message in
> >>>>>>> the
> >>>>>>> thread as well. at some point he posts detailed code...it might be
> >>>>>>> easier to put together a dummy project in order to use the API
> >>>>>>> which is
> >>>>>>> more flexible...
> >>>>>>>
> >>>>>>> Jim
> >>>>>>>
> >>>>>>>
> >>>>>>>
> http://mail-archives.apache.org/mod_mbox/opennlp-users/201112.mbox/%
> >>>>>>>3C4ED76AF3.8020503@gmail.com%3E
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 02/03/13 16:05, James Kosin wrote:
> >>>>>>>> I'm trying to use the CLI.  I have other code in another project
> >>>>>>>> that
> >>>>>>>> loads the dictionary properly using the property file to specify
> >>>>>>>> the
> >>>>>>>> information.  Coreference does it differently...
> >>>>>>>>
> >>>>>>>> On 3/2/2013 6:32 AM, Jim - FooBar(); wrote:
> >>>>>>>>> I should be able to help you...are you going through the cli or
> >>>>>>>>> the
> >>>>>>>>> API? I bet there is something wrong with the Wordnet directory
> >>>>>>>>> you're
> >>>>>>>>> passing...I had similar issues...
> >>>>>>>>> If you're using the API let me know and I'll send you a code
> >>>>>>>>> snippet
> >>>>>>>>> that may help...
> >>>>>>>>>
> >>>>>>>>> Jim
> >>>>>>>>>
> >>>>>>>>> On 02/03/13 05:17, James Kosin wrote:
> >>>>>>>>>> Jim,
> >>>>>>>>>>
> >>>>>>>>>> I can't seem to get past the NULL pointer exceptions when
> >>>>>>>>>> Coreferencer is trying to load the dictionaries. So, this will
> >>>>>>>>>>be
> >>>>>>>>>> much later now.  I'm going to sleep and play tooth fairy.
> >>>>>>>>>>
> >>>>>>>>>> James
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 3/1/2013 8:18 AM, Jim - FooBar(); wrote:
> >>>>>>>>>>> Like you, I'm using the latest WOrdnet and JWNL (1.4 RC_3 is on
> >>>>>>>>>>> maven you don't need to build it from source)....
> >>>>>>>>>>> Now that you've set up your end could you please perform a
> >>>>>>>>>>> run on
> >>>>>>>>>>> the standard example sentence? In addition could you try to
> >>>>>>>>>>> add the
> >>>>>>>>>>> named-entities to the parse-tree?
> >>>>>>>>>>> If yes, please post your results here for comparison with mine?
> >>>>>>>>>>>
> >>>>>>>>>>> thanks a lot,
> >>>>>>>>>>>
> >>>>>>>>>>> Jim
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 01/03/13 02:21, James Kosin wrote:
> >>>>>>>>>>>> Hi Jim,
> >>>>>>>>>>>>
> >>>>>>>>>>>> What version of the JWNL and WordNet dictionaries are you
> >>>>>>>>>>>> using?
> >>>>>>>>>>>> I never got much more than researching what it is used for,
> >>>>>>>>>>>>and
> >>>>>>>>>>>> its importance to handling the task.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've just updated my end for the 3.1 WordNet dictionaries. But
> >>>>>>>>>>>> I'm also using 1.4_rc3 from sources to build JWNL. The extJWNL
> >>>>>>>>>>>> seems to be more apt to handling more types of dictionaries
> >>>>>>>>>>>> (supporting UTF-8 and others), and actually creating and
> >>>>>>>>>>>> modifying
> >>>>>>>>>>>> them as well; which isn't needed when we are really only
> >>>>>>>>>>>> wanting
> >>>>>>>>>>>> read usage.
> >>>>>>>>>>>>
> >>>>>>>>>>>> James
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 2/28/2013 4:49 AM, Jim foo.bar wrote:
> >>>>>>>>>>>>> Hi James,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> thanks for your reply and your comments but that is not quite
> >>>>>>>>>>>>> what I asked...I've looked at all the web resources related to
> >>>>>>>>>>>>> the opennlp coref component, otherwise I would never have
> >>>>>>>>>>>>> gotten
> >>>>>>>>>>>>> it to work!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My problem is about the results it brings back, in
> >>>>>>>>>>>>> particular I'd
> >>>>>>>>>>>>> like to compare my produced discourse entities with someone
> >>>>>>>>>>>>> else's on the same piece of text. Since I'm working in a
> >>>>>>>>>>>>> language other than Java, that would confirm that my code
> >>>>>>>>>>>>> is at
> >>>>>>>>>>>>> least correct. On a secondary note, I'd like to see how to
> >>>>>>>>>>>>> insert
> >>>>>>>>>>>>> the named-entities into the parse tree before deploying the
> >>>>>>>>>>>>> TreebankLinker. I followed the instructions posted by Jorn
> >>>>>>>>>>>>> sometime last year but I'm not sure what the output should
> >>>>>>>>>>>>> look like. That is why I posted what I'm getting...Can you see any
> >>>>>>>>>>>>> 'person' named-entities in my DiscourseEntities?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> More importantly, if you run the coref component on the
> >>>>>>>>>>>>> standard
> >>>>>>>>>>>>> example sentence (Pierre Vinken, ...) what do you get?
> >>>>>>>>>>>>> Could you
> >>>>>>>>>>>>> post the exact output?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Whoever posted this:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> http://blog.dpdearing.com/2012/11/making-coreference-resolution-with-opennlp-1-5-0-your-bitch/
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> did not try to insert any NEs into the parse tree. In
> >>>>>>>>>>>>> addition,
> >>>>>>>>>>>>> his output is slightly different than mine...I don't know
> >>>>>>>>>>>>> if that
> >>>>>>>>>>>>> is because of a newer version of JWNL.jar that I'm using or
> >>>>>>>>>>>>> something else...
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Jim
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 28/02/13 02:51, James Kosin wrote:
> >>>>>>>>>>>>>> Jim,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Here is a place to start, with maybe some more examples:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> http://stackoverflow.com/questions/8629737/coreference-resolution-using-opennlp
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> James
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 2/27/2013 1:26 PM, Jim - FooBar(); wrote:
> >>>>>>>>>>>>>>> Hmmm.... interesting! When I run it on these 2 simple
> >>>>>>>>>>>>>>> sentences:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> /"Mary likes pizza but she also likes kebabs. Knowing
> >>>>>>>>>>>>>>> her, I'd
> >>>>>>>>>>>>>>> give it 2 weeks before she turns massive!"/
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I get perfect results!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> #<DiscourseEntity [ Mary, she, her, she ]>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> this demonstrates 3 things:
> >>>>>>>>>>>>>>> - my understanding of coref is indeed correct
> >>>>>>>>>>>>>>> - the coref component can link entities from separate
> >>>>>>>>>>>>>>> sentences
> >>>>>>>>>>>>>>> - possibly that my code is fine
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> any thoughts?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Jim
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 27/02/13 18:14, Jim - FooBar(); wrote:
> >>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I finally managed to get coref working (phew!-my god
> >>>>>>>>>>>>>>>> that was
> >>>>>>>>>>>>>>>> tricky) but I'm slightly confused with the results so
> >>>>>>>>>>>>>>>> I'd like
> >>>>>>>>>>>>>>>> to see if anyone else has tried that out...Using the
> >>>>>>>>>>>>>>>> standard
> >>>>>>>>>>>>>>>> paragraph used in the other examples:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> /"Pierre Vinken, 61 years old, will join the board as a
> >>>>>>>>>>>>>>>> nonexecutive director Nov. 29. Mr. Vinken is chairman of
> >>>>>>>>>>>>>>>> Elsevier N.V., the Dutch publishing group. Rudolph
> >>>>>>>>>>>>>>>> Agnew, 55
> >>>>>>>>>>>>>>>> years old and former chairman of Consolidated Gold
> >>>>>>>>>>>>>>>> Fields PLC,
> >>>>>>>>>>>>>>>> was named a director of this British industrial
> >>>>>>>>>>>>>>>> conglomerate."/
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> deploying the coref component gives me the following:
> >>>>>>>>>>>>>>>> I must note that I'm trying to pass the named entities
> >>>>>>>>>>>>>>>> as well
> >>>>>>>>>>>>>>>> (person). I've confirmed that the spans are correctly
> >>>>>>>>>>>>>>>> identified (3 spans for this particular example) and added
> >>>>>>>>>>>>>>>> to the parse tree via
> >>>>>>>>>>>>>>>> opennlp.tools.parser.Parse.addNames("person", span,
> >>>>>>>>>>>>>>>> parse.getTagNodes());
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [#<DiscourseEntity [ this British industrial
> >>>>>>>>>>>>>>>> conglomerate ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ a director of this British industrial
> >>>>>>>>>>>>>>>> conglomerate ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ Consolidated Gold Fields PLC ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ chairman of Elsevier N . V . , the
> >>>>>>>>>>>>>>>> Dutch
> >>>>>>>>>>>>>>>> publishing group, former chairman of Consolidated Gold
> >>>>>>>>>>>>>>>> Fields
> >>>>>>>>>>>>>>>> PLC ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ 55 years ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ Rudolph Agnew , 55 years old and
> >>>>>>>>>>>>>>>> former
> >>>>>>>>>>>>>>>> chairman of Consolidated Gold Fields PLC , was named a
> >>>>>>>>>>>>>>>> director of this British industrial conglomerate . ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ Elsevier N . V . , the Dutch
> >>>>>>>>>>>>>>>> publishing
> >>>>>>>>>>>>>>>> group, the Dutch publishing group ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ Mr . Vinken ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ a nonexecutive director Nov . 29 ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ the board ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ 61 years ]>,
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ Pierre Vinken , 61 years old ]>
> >>>>>>>>>>>>>>>> ]
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> *filtering for more than 1 mention (per Jorn's suggestion)
> >>>>>>>>>>>>>>>> gives back:*
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [#<DiscourseEntity [ chairman of Elsevier N . V . , the
> >>>>>>>>>>>>>>>> Dutch
> >>>>>>>>>>>>>>>> publishing group, former chairman of Consolidated Gold
> >>>>>>>>>>>>>>>> Fields
> >>>>>>>>>>>>>>>> PLC ]>
> >>>>>>>>>>>>>>>>   #<DiscourseEntity [ Elsevier N . V . , the Dutch
> >>>>>>>>>>>>>>>> publishing
> >>>>>>>>>>>>>>>> group, the Dutch publishing group ]>
> >>>>>>>>>>>>>>>> ]
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Assuming that this is what it's supposed to output, can
> >>>>>>>>>>>>>>>> someone explain this? First of all where are the
> >>>>>>>>>>>>>>>> named-entities? Secondly, out of the 2 filtered
> >>>>>>>>>>>>>>>> DiscourseEntities, both seem plain wrong! Moreover,
> >>>>>>>>>>>>>>>> where is
> >>>>>>>>>>>>>>>> #<DiscourseEntity [ Rudolph Agnew, former chairman of
> >>>>>>>>>>>>>>>> Consolidated Gold Fields PLC, the Dutch publishing group,
> >>>>>>>>>>>>>>>> director of this British industrial conglomerate ]> ???
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Either I'm not understanding coreference, or I've coded the
> >>>>>>>>>>>>>>>> thing wrong, or the model is not very good! Which one is it?
> >>>>>>>>>>>>>>>> Has anyone else attempted this? Can we compare results
> >>>>>>>>>>>>>>>> on this
> >>>>>>>>>>>>>>>> particular sentence?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> thanks in advance :)
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Jim
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ps: my code is in Clojure but it is based on a code snippet
> >>>>>>>>>>>>>>>> provided by Jorn to someone on the mailing list last year. I
> >>>>>>>>>>>>>>>> can easily provide it but I don't think it will be of much
> >>>>>>>>>>>>>>>> help...
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>