Posted to dev@opennlp.apache.org by Boris Galitsky <bg...@hotmail.com> on 2012/09/19 23:57:22 UTC

to release Similarity component

Hi guys
   I would like to complete this project. This ticket, https://issues.apache.org/jira/browse/OPENNLP-497,
is supposed to conclude it. I edited pom.xml and added files similar to those in apache-opennlp-1.5.2-incubating-bin.zip.
I manually built the distribution archive and attached it to this ticket.
As an additional round of verification that Similarity is working, besides a number of application areas (which live in JUnit tests), I also built a SOLR search handler which re-ranks search results according to the similarity of parse trees for questions and answers. This way SOLR uses OpenNLP to improve search relevance (I will create a ticket and add the request handler).
What are the next steps?
Best regards
Boris
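
The re-ranking idea mentioned above can be illustrated with a small sketch. This is not the SOLR request handler from the message: the class and method names are hypothetical, and a simple word-overlap (Jaccard) score stands in for the parse-tree similarity that the Similarity component computes. It only shows the general shape of re-ordering candidate answers by their similarity to the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of opennlp-similarity.
class SimilarityReRanker {

    // Stand-in score: Jaccard overlap of lowercase word sets.
    // The real component compares parse trees instead.
    static double similarity(String a, String b) {
        Set<String> wordsA = new HashSet<>(Arrays.asList(a.toLowerCase().split("\\W+")));
        Set<String> wordsB = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\W+")));
        Set<String> common = new HashSet<>(wordsA);
        common.retainAll(wordsB);
        Set<String> union = new HashSet<>(wordsA);
        union.addAll(wordsB);
        return union.isEmpty() ? 0.0 : (double) common.size() / union.size();
    }

    // Returns a new list with the answers ordered by descending similarity to the question.
    static List<String> reRank(String question, List<String> answers) {
        List<String> ranked = new ArrayList<>(answers);
        ranked.sort(Comparator
                .comparingDouble((String answer) -> similarity(question, answer))
                .reversed());
        return ranked;
    }
}
```

A search handler would apply something like reRank to the top results returned by the engine and hand back the re-ordered list.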

RE: My colleagues and I verified Similarity component is working and passing tests

Posted by Boris Galitsky <bg...@hotmail.com>.

Hi Jörn
 I am stuck with the deployment script, somewhere around the gpg bug http://jira.codehaus.org/browse/MGPG-9, which makes the process hang. I tried all the options mentioned in the discussion of this ticket, but nothing worked.
I will keep trying.
Regards
Boris

> Date: Tue, 23 Oct 2012 13:13:24 +0200
> From: kottmann@gmail.com
> To: dev@opennlp.apache.org
> Subject: Re: My colleagues and I verified  Similarity component is working and passing tests

Re: My colleagues and I verified Similarity component is working and passing tests

Posted by Jörn Kottmann <ko...@gmail.com>.

Hello,

are you ready now to build a release candidate?

Jörn

On 10/09/2012 06:30 PM, Boris Galitsky wrote:
> Hello
>    Just wanted to say we can proceed with the release of the 'Similarity' component
> Regards
> Boris

My colleagues and I verified Similarity component is working and passing tests

Posted by Boris Galitsky <bg...@hotmail.com>.

Hello
  Just wanted to say we can proceed with the release of the 'Similarity' component.
Regards
Boris

RE: to release Similarity component

Posted by Boris Galitsky <bg...@hotmail.com>.

Hello
  All tests should run now; I have verified this on a couple of machines other than my own.
Regards
Boris

> Date: Tue, 2 Oct 2012 23:08:45 +0200
> From: kottmann@gmail.com
> To: dev@opennlp.apache.org
> Subject: Re: to release Similarity component

RE: to release Similarity component

Posted by Boris Galitsky <bg...@hotmail.com>.

Hello all
 I am caching parsing results in a CSV file and reading this file if the models are not available. It all works locally; let me check out again and see if I can reproduce what you are reporting.
Regards
Boris
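
The fallback described above could look roughly like the following sketch. It is not the code in opennlp-similarity: the class name, the method names and the cache layout (one "sentence<TAB>parse" record per line) are assumptions made for illustration, since the thread does not describe the actual file format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; not part of opennlp-similarity.
class CachedParseSource {
    private final Function<String, String> liveParser; // null when the models are missing
    private final Map<String, String> cache;

    CachedParseSource(Function<String, String> liveParser, Map<String, String> cache) {
        this.liveParser = liveParser;
        this.cache = cache;
    }

    // Turns lines of the assumed form "sentence<TAB>parse" into a lookup table.
    // Lines without a tab are skipped.
    static Map<String, String> parseCacheLines(List<String> lines) {
        Map<String, String> result = new HashMap<>();
        for (String line : lines) {
            int sep = line.indexOf('\t');
            if (sep > 0) {
                result.put(line.substring(0, sep), line.substring(sep + 1));
            }
        }
        return result;
    }

    // Uses the live parser if the model file exists, otherwise the cache file.
    static CachedParseSource create(Path modelFile, Path cacheFile,
                                    Function<String, String> parser) {
        if (Files.exists(modelFile)) {
            return new CachedParseSource(parser, Collections.emptyMap());
        }
        try {
            List<String> lines = Files.readAllLines(cacheFile, StandardCharsets.UTF_8);
            return new CachedParseSource(null, parseCacheLines(lines));
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read parse cache " + cacheFile, e);
        }
    }

    // Parses with the model when available, otherwise looks the sentence up in the cache.
    String parse(String sentence) {
        if (liveParser != null) {
            return liveParser.apply(sentence);
        }
        String cached = cache.get(sentence);
        if (cached == null) {
            throw new IllegalStateException("No model and no cached parse for: " + sentence);
        }
        return cached;
    }
}
```

One consequence of this design is that a fresh checkout only builds if the cache file is committed and covers every sentence the tests use; a missing entry fails loudly instead of silently skipping the test.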






Re: to release Similarity component

Posted by Jörn Kottmann <ko...@gmail.com>.

Hello,

I started to look into the packaging issue, but I cannot build it because
of failing tests. The tests should run through before we release it.

It looks like the tests depend on our English models. For example, I
get this exception:
java.io.FileNotFoundException: /home/xyz/opennlp/sandbox/opennlp-similarity/src/test/resources/models/en-chunker.bin (No such file or directory)
     at java.io.FileInputStream.open(Native Method)
     at java.io.FileInputStream.<init>(FileInputStream.java:137)
     at java.io.FileInputStream.<init>(FileInputStream.java:96)
     at opennlp.tools.textsimilarity.chunker2matcher.ParserChunker2MatcherProcessor.initializeChunker(ParserChunker2MatcherProcessor.java:658)
     at opennlp.tools.textsimilarity.chunker2matcher.ParserChunker2MatcherProcessor.<init>(ParserChunker2MatcherProcessor.java:119)
     at opennlp.tools.textsimilarity.chunker2matcher.ParserChunker2MatcherProcessor.getInstance(ParserChunker2MatcherProcessor.java:140)
     at opennlp.tools.textsimilarity.chunker2matcher.ParserChunker2MatcherProcessorTest.testGroupedPhrasesFormer(ParserChunker2MatcherProcessorTest.java:34)


The models cannot be included in Subversion or in the distribution
because they are not AL 2.0 licensed, so we should not have tests
that depend on them.

Can you fix this? In my opinion the tests should just use pre-annotated
data and not use any of our models to produce this data on the fly.

Please get a fresh checkout of the opennlp-similarity folder and try to
get it to build with mvn install. That needs to work out of the box.
There were also other test failures, e.g. one where it could not parse
a sentence.
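
As a rough sketch of what testing against pre-annotated data could mean: the test input is a sentence whose chunks are already marked up, so no model file is needed to produce them. The bracketed "[TYPE words]" notation, the class name and the method names below are assumptions for illustration, not an existing OpenNLP format or API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of opennlp-similarity.
class PreAnnotatedChunks {

    // One chunk: its type (for example NP or VP) and the words it covers.
    static final class Chunk {
        final String type;
        final String text;

        Chunk(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Matches "[TYPE some words]" in the assumed annotation format.
    private static final Pattern CHUNK = Pattern.compile("\\[(\\w+)\\s+([^\\]]+)\\]");

    // Reads the chunks out of a pre-annotated sentence without using any model.
    static List<Chunk> parse(String annotated) {
        List<Chunk> chunks = new ArrayList<>();
        Matcher m = CHUNK.matcher(annotated);
        while (m.find()) {
            chunks.add(new Chunk(m.group(1), m.group(2).trim()));
        }
        return chunks;
    }
}
```

A test can then feed the resulting chunks straight into the matching logic, which keeps the build independent of files that cannot be checked in.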

Jörn
