Posted to dev@lucenenet.apache.org by Simon Svensson <si...@devhost.se> on 2012/05/27 10:26:04 UTC

Plans for Hunspell integration (and: How do I build the trunk?)

Hi,

First things first: thanks for the warm welcome.

As mentioned earlier, I have ported lucene-hunspell[1], which allows
hunspell dictionaries to be used for stemming, to
Lucene.Net.Analysis.Hunspell[2]. I'm using this with English and Swedish
dictionaries, and I've got indications via commits, mails and questions
that it is also used (or at least tried) with French and Croatian. It is
my intention to move this code into contrib, which brings the first of
many questions: should it be added to Contrib.Analyzers, or a new project?

I'm currently experimenting with the build environment and making sure
that all tools work properly on my machine. However, I'm greeted with
several errors when running "build simple all release":
tests for SimpleFacetedSearch and SpellChecker call a non-existent
overload of IndexReader.Open, and the Memory tests have the wrong assembly
name and output path. The build will proceed if I fix these errors, but some
tests fail (one being TestQueryParser.TextWildCard with "Query
/term~0.7/ yielded /term~0.5/, expecting /term~0.7/"). These tests
also fail in ReSharper's unit test runner.

I've tried "build commit all release" (from the build information wiki 
page[3]) which fails with "NCover v3 does not appear to be installed". 
This is correct; I've been unable to find a free version of NCover v3.
Is the commit build target perhaps only meant for build servers?

I've copied lib\StyleCop.4.5 to C:\Program Files 
(x86)\MSBuild\StyleCop\v4.5 to remove the 
stylecop-4.5-could-not-be-found warnings. I expect to get a gazillion 
stylecop-related warnings when building (stylecop has never really liked 
me), but get none at all. Is the code perfect, or are no rules applied?

So, what's the correct way to build the trunk?

// Simon

[1] https://code.google.com/p/lucene-hunspell/
[2] https://github.com/sisve/Lucene.Net.Analysis.Hunspell
[3] https://cwiki.apache.org/LUCENENET/build-system-scripts.html

Re: Plans for Hunspell integration (and: How do I build the trunk?)

Posted by Stefan Bodewig <bo...@apache.org>.
On 2012-05-29, Christopher Currens wrote:

> "build commit all release" should work.  However, we have one or two issues
> with our current trunk state [...]

> I use the command: "build simple all release" which will clean, build and
> generate an html report for the tests.  It doesn't do stylecop rules,
> though you can do that after the fact with "build rules all release".  Make
> sure you have FxCop installed [...]

> For the record, though, if you WERE to create it as a separate project,
> this would be a normal workflow for it: [...]

Christopher, this is great.  Can you please paste your instructions to a
"how to build" page on the site or the wiki (or whatever you feel might
fit better)?

Cheers

        Stefan

Re: Plans for Hunspell integration (and: How do I build the trunk?)

Posted by Christopher Currens <cu...@gmail.com>.
Simon,

"build commit all release" should work.  However, we have one or two issues
with our current trunk state, as you can see.  It looks like an obsolete
method was removed from IndexReader, but not all tests/projects were compiled
and run before the change was committed.  Not a huge issue at all, since
you've seen it can easily be fixed (I've already committed fixes to trunk for
the things you've listed above).  I'll admit that we *should* be able to build
it using the command you've tried, but unfortunately, we're not in a state
where that's possible.

I use the command: "build simple all release" which will clean, build and
generate an HTML report for the tests.  It doesn't run the StyleCop rules,
though you can do that after the fact with "build rules all release".  Make
sure you have FxCop installed in the expected location (on my computer,
this is C:\Program Files (x86)\Microsoft Fxcop 10.0), otherwise it will
*look* like it succeeded, but it will not actually have run the rules at
all.  I believe it's set to be a warning if FxCop doesn't exist.  The same
goes for StyleCop.  Keep in mind you'll get warnings up the wazoo, but
that's because the code is full of issues.  We never did agree on a set of
rules past "use the ones Microsoft uses", so at this point, we're only
asking people to try to follow the rules they see in the current code...as
ambiguous as that might be.  I don't agree with all of the Microsoft rules,
for example, but I would like to see a set of StyleCop and ReSharper rules
put together and stored in the repo.

Anyway, as you might be able to tell, we do need some work done with our
build scripts.  Michael did a great job setting them up, but as the project
has evolved, they've become slightly neglected, and there are a few
problems.  We also don't have any documentation showing how to build
Lucene.Net from the command line, step by step.  If you'd be willing to do
some work in trying to clean them up or get documentation together, we
could definitely use it.  At the very least, perhaps you could create
issues for these things in JIRA.

As for your port of Hunspell, I feel it would be best if it were
added to the Analyzers project, so you can just import the code and tests
into those respective projects.  Past that, you won't need to update the
build scripts, since it won't output any extra dlls or xmls that we would
need to deal with.

=====================

For the record, though, if you WERE to create it as a separate project,
this would be a normal workflow for it:

Put the source in a folder such as: trunk/src/contrib/Hunspell
Put the tests in a folder such as: trunk/test/contrib/Hunspell
-- Make sure you sign BOTH output dlls with the strong name key in
trunk/lib/Lucene.Net.snk

Create the following solutions for your project in trunk/build/vs2010:
-- contrib/Contrib.Hunspell.sln
-- test/Contrib.Hunspell.Test.sln

Add the Hunspell project to these existing solutions in trunk/build/vs2010:
-- contrib/Contrib.All.sln
-- test/Contrib.All.Test.sln

Add/Update the build scripts in the trunk/build/scripts directory:
-- Create a folder named Hunspell in this directory
---- create a document.targets and project.targets with the correct paths
(model it after an existing one)
---- in Contrib/document.targets, add the documentation sources for your
project to the existing list
---- in Contrib/project.targets, add the Hunspell/project.targets as an
import
---- in Contrib/Lucene.Net.Contrib.nuspec, add the dll and xml files to the
existing list
---- repeat the previous 3 steps with the target files in the "All"
directory


Thanks,
Christopher


On Sun, May 27, 2012 at 1:26 AM, Simon Svensson <si...@devhost.se> wrote:

> Hi,
>
> First things first: thanks for the warm welcome.
>
> As mentioned earlier, I have ported lucene-hunspell[1],  which allows
> hunspell dictionaries to be used for stemming, to
> Lucene.Net.Analysis.Hunspell[2]. I'm using this with English and
> Swedish dictionaries, and I've got indications via commits, mails and
> questions that it is also used (or at least tried) with French and
> Croatian. It is my intention to move this code into contrib, which brings
> the first of many questions: should it be added to Contrib.Analyzers, or a
> new project?
>
> I'm currently experimenting with the build environment and making sure
> that all tools work properly on my machine. However, I'm greeted with
> several errors when running "build simple all release": tests
> for SimpleFacetedSearch and SpellChecker call a non-existent overload of
> IndexReader.Open, and the Memory tests have the wrong assembly name and
> output path. The build will proceed if I fix these errors, but some tests
> fail (one being TestQueryParser.TextWildCard with "Query /term~0.7/ yielded
> /term~0.5/, expecting /term~0.7/"). These tests also fail in ReSharper's
> unit test runner.
>
> I've tried "build commit all release" (from the build information wiki
> page[3]) which fails with "NCover v3 does not appear to be installed". This
> is correct; I've been unable to find a free version of NCover v3. Is the
> commit build target perhaps only meant for build servers?
>
> I've copied lib\StyleCop.4.5 to C:\Program Files
> (x86)\MSBuild\StyleCop\v4.5 to remove the stylecop-4.5-could-not-be-found
> warnings. I expect to get a gazillion stylecop-related warnings when
> building (stylecop has never really liked me), but get none at all. Is the
> code perfect, or are no rules applied?
>
> So, what's the correct way to build the trunk?
>
> // Simon
>
> [1] https://code.google.com/p/lucene-hunspell/
> [2] https://github.com/sisve/Lucene.Net.Analysis.Hunspell
> [3] https://cwiki.apache.org/LUCENENET/build-system-scripts.html
>

Re: Plans for Hunspell integration (and: How do I build the trunk?)

Posted by Simon Svensson <si...@devhost.se>.
I've found the issue: I'm running with CurrentCulture = sv-SE, and the
tests presume a culture that parses "0.7" with a period as the decimal
separator (Swedish uses a comma). I've reported this as
https://issues.apache.org/jira/browse/LUCENENET-490. This looks like a
regression; the problem does not exist in 2.9.4.

What's the cleanest way to change these tests to run under several
cultures? Having an array of cultures and iterating over it in every
test sounds ... unclean. The SetCulture attribute doesn't seem to work
(perhaps an R# limitation), and it would still require duplicating each
test for every culture.
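
To make the problem concrete, here is a minimal sketch of the pitfall and of
a "run this check under each culture" helper. It is written in Java rather
than C#, purely for illustration, and is not the Lucene.Net code: the names
CultureSensitiveParsing, parseOrDefault and forEachLocale are made up for the
example, and the 0.5 fallback is only an assumption modelled on the
"yielded /term~0.5/" message quoted above.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CultureSensitiveParsing {

    /**
     * Parses text as a float using the given locale's number format.
     * Returns the fallback unless the whole string is consumed, mimicking
     * a parser that silently keeps its default when parsing fails.
     * (Hypothetical helper, not part of any Lucene API.)
     */
    public static float parseOrDefault(String text, Locale locale, float fallback) {
        ParsePosition pos = new ParsePosition(0);
        Number parsed = NumberFormat.getNumberInstance(locale).parse(text, pos);
        if (parsed == null || pos.getIndex() != text.length()) {
            return fallback;
        }
        return parsed.floatValue();
    }

    /**
     * Runs the check once per locale with that locale set as the process
     * default, and restores the original default afterwards.
     */
    public static void forEachLocale(List<Locale> locales, Runnable check) {
        Locale original = Locale.getDefault();
        try {
            for (Locale locale : locales) {
                Locale.setDefault(locale);
                check.run();
            }
        } finally {
            Locale.setDefault(original);
        }
    }

    public static void main(String[] args) {
        List<Locale> locales =
                Arrays.asList(Locale.ROOT, Locale.US, Locale.forLanguageTag("sv-SE"));
        forEachLocale(locales, () -> {
            // Locale-sensitive: the result depends on the current default locale.
            float sensitive = parseOrDefault("0.7", Locale.getDefault(), 0.5f);
            // Locale-independent: always parsed with a fixed, invariant locale.
            float invariant = parseOrDefault("0.7", Locale.ROOT, 0.5f);
            System.out.println(Locale.getDefault().toLanguageTag()
                    + ": sensitive=" + sensitive + " invariant=" + invariant);
        });
    }
}
```

Under sv-SE the locale-sensitive call should fall back to 0.5, because the
period is not a decimal separator there, while the invariant call keeps 0.7.
The rough .NET counterparts would be passing CultureInfo.InvariantCulture to
float.Parse, and a helper that sets and restores
Thread.CurrentThread.CurrentCulture around the test body; whether that fits
the existing test base classes is something I haven't checked.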

On 2012-05-29 20:03, Christopher Currens wrote:
> I can't reproduce the failed tests in TestQueryParser.TestWildCard.  Works
> for me in both debug and release builds, running via R# and Gallio.
>
> On Tue, May 29, 2012 at 10:45 AM, Prescott Nasser<ge...@hotmail.com>wrote:
>
>>
>>> my intention to move this code into contrib, which brings the first of
>>> many questions: should it be added to Contrib.Analyzers, or a new project?
>>
>> Analyzers sounds like the right space for it.
>>
>>> I'm currently experimenting with the build environment and making sure
>>> that all tools work properly on my machine. However, I'm greeted with
>>> several errors when running "build simple all release":
>>> tests for SimpleFacetedSearch and SpellChecker call a non-existent
>>> overload of IndexReader.Open, and the Memory tests have the wrong assembly
>>> name and output path. The build will proceed if I fix these errors, but some
>>> tests fail (one being TestQueryParser.TextWildCard with "Query
>>> /term~0.7/ yielded /term~0.5/, expecting /term~0.7/"). These tests
>>> also fail in ReSharper's unit test runner.
>>
>> I'll try to take a look at this this week. We've mostly been focusing on
>> the core, and I know that the contrib packages have started to fall by the
>> wayside. We need to take a good look at them, make sure they are the right
>> ports, and make the fixes to adjust to our API. If you do notice problems,
>> I'd encourage you to at the very least throw up a JIRA issue. If it turns
>> out it's not a problem, we can always close it.
>>
>>> I've tried "build commit all release" (from the build information wiki
>>> page[3]) which fails with "NCover v3 does not appear to be installed".
>>> This is correct; I've been unable to find a free version of NCover v3.
>>> Is the commit build target perhaps only meant for build servers?
>> I think it was Michael who did all the work around the build system, I
>> still build the old fashioned way... VS2010 - right click, build.
>>
>>> I've copied lib\StyleCop.4.5 to C:\Program Files
>>> (x86)\MSBuild\StyleCop\v4.5 to remove the
>>> stylecop-4.5-could-not-be-found warnings. I expect to get a gazillion
>>> stylecop-related warnings when building (stylecop has never really liked
>>> me), but get none at all. Is the code perfect, or are no rules applied?
>> That's a good question. We discussed StyleCop at one point, but I don't
>> think we ever reached a consensus on that.
>>

Re: Plans for Hunspell integration (and: How do I build the trunk?)

Posted by Christopher Currens <cu...@gmail.com>.
I can't reproduce the failed tests in TestQueryParser.TestWildCard.  Works
for me in both debug and release builds, running via R# and Gallio.

On Tue, May 29, 2012 at 10:45 AM, Prescott Nasser <ge...@hotmail.com>wrote:

>
>
> > my intention to move this code into contrib, which brings the first of
> > many questions: should it be added to Contrib.Analyzers, or a new project?
>
> Analyzers sounds like the right space for it.
>
> > I'm currently experimenting with the build environment and making sure
> > that all tools work properly on my machine. However, I'm greeted with
> > several errors when running "build simple all release":
> > tests for SimpleFacetedSearch and SpellChecker call a non-existent
> > overload of IndexReader.Open, and the Memory tests have the wrong assembly
> > name and output path. The build will proceed if I fix these errors, but some
> > tests fail (one being TestQueryParser.TextWildCard with "Query
> > /term~0.7/ yielded /term~0.5/, expecting /term~0.7/"). These tests
> > also fail in ReSharper's unit test runner.
>
> I'll try to take a look at this this week. We've mostly been focusing on
> the core, and I know that the contrib packages have started to fall by the
> wayside. We need to take a good look at them, make sure they are the right
> ports, and make the fixes to adjust to our API. If you do notice problems,
> I'd encourage you to at the very least throw up a JIRA issue. If it turns
> out it's not a problem, we can always close it.
>
> > I've tried "build commit all release" (from the build information wiki
> > page[3]) which fails with "NCover v3 does not appear to be installed".
> > This is correct; I've been unable to find a free version of NCover v3.
> > Is the commit build target perhaps only meant for build servers?
>
> I think it was Michael who did all the work around the build system, I
> still build the old fashioned way... VS2010 - right click, build.
>
> > I've copied lib\StyleCop.4.5 to C:\Program Files
> > (x86)\MSBuild\StyleCop\v4.5 to remove the
> > stylecop-4.5-could-not-be-found warnings. I expect to get a gazillion
> > stylecop-related warnings when building (stylecop has never really liked
> > me), but get none at all. Is the code perfect, or are no rules applied?
>
> That's a good question. We discussed StyleCop at one point, but I don't
> think we ever reached a consensus on that.
>

RE: Plans for Hunspell integration (and: How do I build the trunk?)

Posted by Prescott Nasser <ge...@hotmail.com>.

> my intention to move this code into contrib, which brings the first of
> many questions: should it be added to Contrib.Analyzers, or a new project?

Analyzers sounds like the right space for it.

> I'm currently experimenting with the build environment and making sure 
> that all tools work properly on my machine. However, I'm greeted with 
> several errors when running "build simple all release":
> tests for SimpleFacetedSearch and SpellChecker call a non-existent
> overload of IndexReader.Open, and the Memory tests have the wrong assembly
> name and output path. The build will proceed if I fix these errors, but some
> tests fail (one being TestQueryParser.TextWildCard with "Query
> /term~0.7/ yielded /term~0.5/, expecting /term~0.7/"). These tests
> also fail in ReSharper's unit test runner.

I'll try to take a look at this this week. We've mostly been focusing on the core, and I know that the contrib packages have started to fall by the wayside. We need to take a good look at them, make sure they are the right ports, and make the fixes to adjust to our API. If you do notice problems, I'd encourage you to at the very least throw up a JIRA issue. If it turns out it's not a problem, we can always close it.

> I've tried "build commit all release" (from the build information wiki 
> page[3]) which fails with "NCover v3 does not appear to be installed". 
> This is correct; I've been unable to find a free version of NCover v3.
> Is the commit build target perhaps only meant for build servers?

I think it was Michael who did all the work around the build system, I still build the old fashioned way... VS2010 - right click, build.

> I've copied lib\StyleCop.4.5 to C:\Program Files 
> (x86)\MSBuild\StyleCop\v4.5 to remove the 
> stylecop-4.5-could-not-be-found warnings. I expect to get a gazillion 
> stylecop-related warnings when building (stylecop has never really liked 
> me), but get none at all. Is the code perfect, or are no rules applied?

That's a good question. We discussed StyleCop at one point, but I don't think we ever reached a consensus on that.