Posted to users@opennlp.apache.org by Chris Collins <ch...@yahoo.com> on 2012/08/05 03:40:28 UTC

Anyone see issues with jwnl library hangs?

I am building a classifier with OpenNLP and leveraging JWNLDictionary.  In my experiments I am finding that after many invocations of the classifier it hangs in JWNL, specifically while doing a read.  I am using WordNet 3.0 (the normal Princeton distribution, not Stanford's).



The thread dump is below (not all of it, just the OpenNLP + JWNL part).  In the hang case it was trying to work with the word "found_r_n_rnhttpsttlc_blablacompost_show_post_full_view_dw_w_b_ar_dxxcfazbay_ppid_rnrn"

Now clearly that isn't a word, so I should work on my tokenization :-}

I tried this in a single thread, with both the default JWNL version defined in the OpenNLP pom and 1.4 rc3.

Any pointers would be helpful.

Cheers

C


java.lang.Thread.State: RUNNABLE
	  at java.io.RandomAccessFile.read(RandomAccessFile.java:-1)
	  at java.io.RandomAccessFile.readLine(RandomAccessFile.java:871)
	  at net.didion.jwnl.princeton.file.PrincetonRandomAccessDictionaryFile.readLine(PrincetonRandomAccessDictionaryFile.java:48)
	  at net.didion.jwnl.dictionary.file_manager.FileManagerImpl.getIndexedLinePointer(FileManagerImpl.java:220)
	  - locked <0x1071> (a net.didion.jwnl.princeton.file.PrincetonRandomAccessDictionaryFile)
	  at net.didion.jwnl.dictionary.FileBackedDictionary.getIndexWord(FileBackedDictionary.java:171)
	  at net.didion.jwnl.dictionary.morph.LookupIndexWordOperation.execute(LookupIndexWordOperation.java:15)
	  at net.didion.jwnl.dictionary.morph.AbstractDelegatingOperation.delegate(AbstractDelegatingOperation.java:47)
	  at net.didion.jwnl.dictionary.morph.TokenizerOperation.tryAllCombinations(TokenizerOperation.java:131)
	  at net.didion.jwnl.dictionary.morph.TokenizerOperation.tryAllCombinations(TokenizerOperation.java:102)
	  at net.didion.jwnl.dictionary.morph.TokenizerOperation.execute(TokenizerOperation.java:75)
	  at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor$LookupInfo.executeNextOperation(DefaultMorphologicalProcessor.java:172)
	  at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor.lookupNextBaseForm(DefaultMorphologicalProcessor.java:125)
	  at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor.lookupAllBaseForms(DefaultMorphologicalProcessor.java:142)
	  at opennlp.tools.coref.mention.JWNLDictionary.getLemmas(JWNLDictionary.java:99)
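
The trace above passes through TokenizerOperation.tryAllCombinations, and the token in question contains 15 underscores. Whether those two facts explain the hang is not established in this thread. As a cheap workaround on the tokenization side, a guard along these lines could keep obviously non-word tokens away from the dictionary lookup. This is a minimal sketch: the class name, the thresholds, and the set of delimiter characters are all assumptions made for illustration, not part of OpenNLP or JWNL.

```java
public final class TokenGuard {
    // Hypothetical thresholds; tune them against your own data.
    private static final int MAX_LENGTH = 40;
    private static final int MAX_DELIMITERS = 3;

    private TokenGuard() {}

    /** Returns true if the token looks worth sending to a dictionary lookup. */
    public static boolean isPlausibleWord(String token) {
        if (token == null || token.isEmpty() || token.length() > MAX_LENGTH) {
            return false;
        }
        return countDelimiters(token) <= MAX_DELIMITERS;
    }

    /** Counts characters a tokenizer might split on (assumed set: '_', '-', ' '). */
    public static int countDelimiters(String token) {
        int count = 0;
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c == '_' || c == '-' || c == ' ') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String bad = "found_r_n_rnhttpsttlc_blablacompost_show_post_full_view"
                + "_dw_w_b_ar_dxxcfazbay_ppid_rnrn";
        System.out.println(isPlausibleWord("found")); // short, no delimiters
        System.out.println(isPlausibleWord(bad));     // too long, too many delimiters
    }
}
```

A caller would check isPlausibleWord(token) before invoking JWNLDictionary.getLemmas and skip the lookup when it returns false.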


Re: Anyone see issues with jwnl library hangs?

Posted by Aliaksandr Autayeu <al...@autayeu.com>.
I had similar issues with JWNL, but that was a long time ago and I don't
remember the details now. A small piece of code to reproduce the issue would
help a lot in looking into it ;)

Aliaksandr
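
For anyone trying to build such a reproduction, one option is a small harness that runs each lookup with a time limit and reports the tokens that do not finish. The sketch below is only an illustration: the class and method names are hypothetical, the lookup is passed in as a plain Function so the harness does not depend on JWNL, and fakeLookup merely simulates a slow call. Note that interrupting a thread does not necessarily stop code that is busy in file I/O, so each task runs on its own daemon thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

public final class SlowLookupFinder {
    private SlowLookupFinder() {}

    /** Returns the tokens whose lookup did not complete within timeoutMillis. */
    public static List<String> findSlowTokens(
            List<String> tokens, Function<String, ?> lookup, long timeoutMillis) {
        List<String> slow = new ArrayList<>();
        for (String token : tokens) {
            // Fresh daemon thread per token, so a stuck lookup cannot block
            // later tokens or keep the JVM alive.
            ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            });
            Callable<Object> task = () -> lookup.apply(token);
            Future<Object> result = executor.submit(task);
            try {
                result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                slow.add(token);
            } catch (ExecutionException e) {
                // The lookup threw an exception: a failure, but not a hang.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            } finally {
                executor.shutdownNow();
            }
        }
        return slow;
    }

    /** Stand-in for a real lookup: pretends tokens with '_' take two seconds. */
    public static String fakeLookup(String token) {
        if (token.indexOf('_') >= 0) {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return token;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("found", "a_b", "word");
        System.out.println(findSlowTokens(tokens, SlowLookupFinder::fakeLookup, 200));
    }
}
```

To use it against the real dictionary, replace fakeLookup with a function that calls the lookup under test and feed it the tokens from the failing run.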

On Mon, Aug 6, 2012 at 10:39 PM, Jörn Kottmann <ko...@gmail.com> wrote:

> Hello,
>
> never experienced that issue.
>
> It's blocking in RandomAccessFile.readLine.
> According to the javadoc it should not block forever; one of the following
> conditions should usually be reached quickly.
>
> "This method blocks until a newline character is read, a carriage return
> and the byte following it are read (to see if it is a newline), the end of
> the file is reached, or an exception is thrown."
>
> http://docs.oracle.com/javase/1.4.2/docs/api/java/io/RandomAccessFile.html#readLine()
>
> Not sure what's going wrong there. Can you post some code to reproduce it?
> Maybe the call is reading in too much data.
>
> Jörn
>
>
> On 08/05/2012 03:40 AM, Chris Collins wrote:
>
>> I am building a classifier with OpenNLP and leveraging JWNLDictionary.
>>  In my experiments I am finding after many invocations of the classifier it
>> hangs in jwnl.  Specifically doing a read.  I am using WordNet 3.0 (normal
>> princeton distro, not stanfords).
>>
>>
>>
>> The thread dump is below (well not all of it but the OpenNLP + jwnl part.
>>  In the halt case it was trying to work with the word
>> "found_r_n_rnhttpsttlc_blablacompost_show_post_full_view_dw_w_b_ar_dxxcfazbay_ppid_rnrn"
>>
>> Now clearly that isnt a word so I should work on my tokenization :-}
>>
>> I tried this in a single thread and tried the default jwnl defined in the
>> opennlp pom and also 1.4 rc3
>>
>> Any pointers would be helpful.
>>
>> Cheers
>>
>> C
>>
>>
>> java.lang.Thread.State: RUNNABLE
>>           at java.io.RandomAccessFile.read(RandomAccessFile.java:-1)
>>           at java.io.RandomAccessFile.readLine(RandomAccessFile.java:871)
>>           at net.didion.jwnl.princeton.file.PrincetonRandomAccessDictionaryFile.readLine(PrincetonRandomAccessDictionaryFile.java:48)
>>           at net.didion.jwnl.dictionary.file_manager.FileManagerImpl.getIndexedLinePointer(FileManagerImpl.java:220)
>>           - locked <0x1071> (a net.didion.jwnl.princeton.file.PrincetonRandomAccessDictionaryFile)
>>           at net.didion.jwnl.dictionary.FileBackedDictionary.getIndexWord(FileBackedDictionary.java:171)
>>           at net.didion.jwnl.dictionary.morph.LookupIndexWordOperation.execute(LookupIndexWordOperation.java:15)
>>           at net.didion.jwnl.dictionary.morph.AbstractDelegatingOperation.delegate(AbstractDelegatingOperation.java:47)
>>           at net.didion.jwnl.dictionary.morph.TokenizerOperation.tryAllCombinations(TokenizerOperation.java:131)
>>           at net.didion.jwnl.dictionary.morph.TokenizerOperation.tryAllCombinations(TokenizerOperation.java:102)
>>           at net.didion.jwnl.dictionary.morph.TokenizerOperation.execute(TokenizerOperation.java:75)
>>           at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor$LookupInfo.executeNextOperation(DefaultMorphologicalProcessor.java:172)
>>           at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor.lookupNextBaseForm(DefaultMorphologicalProcessor.java:125)
>>           at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor.lookupAllBaseForms(DefaultMorphologicalProcessor.java:142)
>>           at opennlp.tools.coref.mention.JWNLDictionary.getLemmas(JWNLDictionary.java:99)
>>
>>
>

Re: Anyone see issues with jwnl library hangs?

Posted by Jörn Kottmann <ko...@gmail.com>.
Hello,

I have never experienced that issue.

It's blocking in RandomAccessFile.readLine.
According to the javadoc it should not block forever; one of the following
conditions should usually be reached quickly.

"This method blocks until a newline character is read, a carriage return 
and the byte following it are read (to see if it is a newline), the end 
of the file is reached, or an exception is thrown."

http://docs.oracle.com/javase/1.4.2/docs/api/java/io/RandomAccessFile.html#readLine()

Not sure what's going wrong there. Can you post some code to reproduce it?
Maybe the call is reading in too much data.

Jörn
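
The termination conditions quoted from the javadoc can be checked with a small standalone program. The sketch below writes a temporary file containing a "\n" terminator, a "\r\n" terminator, and a final line with no terminator, then reads it back with RandomAccessFile.readLine until it returns null at end of file. The class and method names are made up for this illustration. One thing worth keeping in mind when reading the thread dump: RandomAccessFile is unbuffered and readLine pulls the file one byte at a time through read(), so a thread that is merely busy doing many lookups would show the same top frames as one that is truly stuck.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public final class ReadLineDemo {
    private ReadLineDemo() {}

    /** Reads every line of the file using RandomAccessFile.readLine. */
    public static List<String> readAllLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            String line;
            // readLine returns null once end of file is hit before any byte is read.
            while ((line = raf.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    /** Writes a three-line sample file with mixed terminators and reads it back. */
    public static List<String> demoLines() {
        try {
            File tmp = File.createTempFile("readline-demo", ".txt");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), "a\nb\r\nc".getBytes(StandardCharsets.US_ASCII));
            return readAllLines(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoLines()); // prints [a, b, c]
    }
}
```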

On 08/05/2012 03:40 AM, Chris Collins wrote:
> I am building a classifier with OpenNLP and leveraging JWNLDictionary.  In my experiments I am finding after many invocations of the classifier it hangs in jwnl.  Specifically doing a read.  I am using WordNet 3.0 (normal princeton distro, not stanfords).
>
>
>
> The thread dump is below (well not all of it but the OpenNLP + jwnl part.  In the halt case it was trying to work with the word "found_r_n_rnhttpsttlc_blablacompost_show_post_full_view_dw_w_b_ar_dxxcfazbay_ppid_rnrn"
>
> Now clearly that isnt a word so I should work on my tokenization :-}
>
> I tried this in a single thread and tried the default jwnl defined in the opennlp pom and also 1.4 rc3
>
> Any pointers would be helpful.
>
> Cheers
>
> C
>
>
> java.lang.Thread.State: RUNNABLE
> 	  at java.io.RandomAccessFile.read(RandomAccessFile.java:-1)
> 	  at java.io.RandomAccessFile.readLine(RandomAccessFile.java:871)
> 	  at net.didion.jwnl.princeton.file.PrincetonRandomAccessDictionaryFile.readLine(PrincetonRandomAccessDictionaryFile.java:48)
> 	  at net.didion.jwnl.dictionary.file_manager.FileManagerImpl.getIndexedLinePointer(FileManagerImpl.java:220)
> 	  - locked <0x1071> (a net.didion.jwnl.princeton.file.PrincetonRandomAccessDictionaryFile)
> 	  at net.didion.jwnl.dictionary.FileBackedDictionary.getIndexWord(FileBackedDictionary.java:171)
> 	  at net.didion.jwnl.dictionary.morph.LookupIndexWordOperation.execute(LookupIndexWordOperation.java:15)
> 	  at net.didion.jwnl.dictionary.morph.AbstractDelegatingOperation.delegate(AbstractDelegatingOperation.java:47)
> 	  at net.didion.jwnl.dictionary.morph.TokenizerOperation.tryAllCombinations(TokenizerOperation.java:131)
> 	  at net.didion.jwnl.dictionary.morph.TokenizerOperation.tryAllCombinations(TokenizerOperation.java:102)
> 	  at net.didion.jwnl.dictionary.morph.TokenizerOperation.execute(TokenizerOperation.java:75)
> 	  at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor$LookupInfo.executeNextOperation(DefaultMorphologicalProcessor.java:172)
> 	  at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor.lookupNextBaseForm(DefaultMorphologicalProcessor.java:125)
> 	  at net.didion.jwnl.dictionary.morph.DefaultMorphologicalProcessor.lookupAllBaseForms(DefaultMorphologicalProcessor.java:142)
> 	  at opennlp.tools.coref.mention.JWNLDictionary.getLemmas(JWNLDictionary.java:99)
>