Posted to dev@lucenenet.apache.org by Stefan Bodewig <bo...@apache.org> on 2011/01/26 18:38:59 UTC

Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

[changing the subject to keep the other thread clean]

Thanks for your answers, Scott.

On 2011-01-26, Lombard, Scott wrote:

> On 2011-01-26, Stefan Bodewig wrote:

>>  * have you considered IKVM rather than a line-by-line translation?

> The end idea is to port Lucene with both an automated line-by-line and
> have a branch that provides a port to use .NET specific functions.  I
> assume that IKVM would not allow for the .NET centric version?

Not really.

In theory you can use ikvmc to compile the Java source files into a .NET
DLL that references some IKVM DLLs and an ikvmc'ed version of OpenJDK's
classlib.  After that it is a plain .NET DLL and one could write a .NET
centric API using that DLL.

I haven't really tried it on anything serious and it may become tricky
if reflection gets involved.  And there is some layer of indirection you
wouldn't have by a line by line translation that may lead to decreased
performance.  I'd be game to try it out, though.

>>  * what is the target C#/.NET version?

> Currently the project is a VS 2005 project using .NET 2.0.
> https://issues.apache.org/jira/browse/LUCENENET-377 talks about
> changing the project to VS 2010 and it is still being resolved.

> As .NET features are being added the .NET version will be upgraded to
> meet the feature requests.

>>  * is Mono support a goal?

> Mono support is a goal.  Robert Jordan has committed to take the lead
> on that aspect.

I see, thank you.

> I will add my comments to a Wiki creating an FAQ page.

Yes, please do.

> Once I find the Wiki.

8-)

We can set up one once Lucene.NET is inside the Incubator again if there
isn't already one.

Thanks a lot

       Stefan

RE: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by "Lombard, Scott" <sl...@KINGINDUSTRIES.COM>.
Can anyone put together a test case and attach it to https://issues.apache.org/jira/browse/LUCENENET-380 for evaluation?  It seems like it is worth evaluating.


Scott



-----Original Message-----
From: Hans Merkl [mailto:hm@hmerkl.com]
Sent: Wednesday, January 26, 2011 1:39 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

A .NET wrapper around the IKVM classes may be a good idea.

I like the idea that IKVM would also allow use of tons of other useful
Java/Lucene code that's out there. There are some filters and analyzers in
Java that might be very useful for my work. That's not really possible with
the line-by-line port. It may be possible with Sharpen though.




Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Hans Merkl <hm...@hmerkl.com>.
That's only IF there is such a tradeoff. From what I have read IKVM is as
fast as Lucene.NET.

I agree with DIGY about not hijacking this thread. I think I have started
this.

On Wed, Jan 26, 2011 at 16:38, Nicholas Paldino [.NET/C# MVP] <
casperOne@caspershouse.com> wrote:

> As a consumer (and I think that most consumers would agree), I'd have to
> disagree STRONGLY on trading off performance for ease of conversion.
>
> Lucene and Lucene.NET are predicated on performance; compromising that, IMO,
> runs contrary to the core principles of Lucene.
>
>        - Nick
>
>

RE: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by "Nicholas Paldino [.NET/C# MVP]" <ca...@caspershouse.com>.
As a consumer (and I think that most consumers would agree), I'd have to
disagree STRONGLY on trading off performance for ease of conversion.

Lucene and Lucene.NET are predicated on performance; compromising that, IMO,
runs contrary to the core principles of Lucene.

	- Nick

-----Original Message-----
From: Hans Merkl [mailto:hm@hmerkl.com] 
Sent: Wednesday, January 26, 2011 3:18 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial
Committors List, Contributors List)

Personally I am willing to trade some performance for always being up to
date with the latest Java releases and also being able to use other Java
code. Although as far as I have seen most people say it's at the same speed
or even slightly faster than the .NET version.

I personally would be more likely to contribute to IKVM if there are any
issues since it would also benefit other Java code I use like TIKA. I wonder
if anybody has ever tried Lucene with IKVM in a heavy load production
environment. I use it in a one thread per index desktop app and wouldn't
notice if there were any issues under heavy load.





RE: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Digy <di...@gmail.com>.
I think we should discuss it later so as not to hijack this thread.
DIGY

-----Original Message-----
From: Hans Merkl [mailto:hm@hmerkl.com] 
Sent: Wednesday, January 26, 2011 11:29 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial
Committors List, Contributors List)

I use it in my desktop app and it works great. My app gets distributed and
with TIKA I know what I get on the customer machine. With IFilter you never
know...

On Wed, Jan 26, 2011 at 15:55, Digy <di...@gmail.com> wrote:

> Although TIKA is a very good project, I've never needed it in a Windows
> environment. Using the IFilter interface solves most (if not all) of the
> problems related to converting a document to text.
>
> DIGY
>
>
>


Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Hans Merkl <hm...@hmerkl.com>.
I use it in my desktop app and it works great. My app gets distributed and
with TIKA I know what I get on the customer machine. With IFilter you never
know...

On Wed, Jan 26, 2011 at 15:55, Digy <di...@gmail.com> wrote:

> Although TIKA is a very good project, I've never needed it in a Windows
> environment. Using the IFilter interface solves most (if not all) of the
> problems related to converting a document to text.
>
> DIGY
>
>
>

RE: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Digy <di...@gmail.com>.
Although TIKA is a very good project, I've never needed it in a Windows
environment. Using the IFilter interface solves most (if not all) of the
problems related to converting a document to text.

DIGY


-----Original Message-----
From: Hans Merkl [mailto:hm@hmerkl.com] 
Sent: Wednesday, January 26, 2011 10:18 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial
Committors List, Contributors List)

Personally I am willing to trade some performance for always being up to
date with the latest Java releases and also being able to use other Java
code. Although as far as I have seen most people say it's at the same speed
or even slightly faster than the .NET version.

I personally would be more likely to contribute to IKVM if there are any
issues since it would also benefit other Java code I use like TIKA. I wonder
if anybody has ever tried Lucene with IKVM in a heavy load production
environment. I use it in a one thread per index desktop app and wouldn't
notice if there were any issues under heavy load.



Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Robert Muir <rc...@gmail.com>.
On Wed, Jan 26, 2011 at 3:18 PM, Hans Merkl <hm...@hmerkl.com> wrote:
> Personally I am willing to trade some performance for always being up to
> date with the latest Java releases and also being able to use other Java
> code. Although as far as I have seen most people say it's at the same speed
> or even slightly faster than the .NET version.
>
> I personally would be more likely to contribute to IKVM if there are any
> issues since it would also benefit other Java code I use like TIKA. I wonder
> if anybody has ever tried Lucene with IKVM in a heavy load production
> environment. I use it in a one thread per index desktop app and wouldn't
> notice if there were any issues under heavy load.
>

From the Java camp: not too long ago I took Lucene's trunk (to be 4.0)
and ran the unit test suite with IKVM...
As someone who has tested alternative Java implementations with Lucene
(e.g. JRockit, IBM's J9, Harmony), I would only call the results very
impressive.

In fact I only had one test failure, in TestIndexWriterExceptions,
which in my opinion isn't serious at all; it's there to test this:
https://issues.apache.org/jira/browse/LUCENE-1214

I think it's likely due to the way that IKVM optimizes exceptions[1],
and our unit test does weird things with examining stacktraces and
such to simulate the condition... in other words not a real problem.
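
To illustrate what "examining stacktraces to simulate the condition" can
look like, here is a minimal Java sketch. It is not the code of
TestIndexWriterExceptions; the names StackTraceFailureSketch, IOAction,
failIfCalledFrom, flush, commit and fails are all hypothetical, chosen only
for illustration. The hook walks the current call stack and throws only when
a method with the given name is among the callers.

```java
import java.io.IOException;

// Hypothetical sketch: not taken from Lucene's test suite.
class StackTraceFailureSketch {

    /** An action that may throw IOException. */
    interface IOAction {
        void run() throws IOException;
    }

    /** Throws only if a method named methodName is on the current call stack. */
    static void failIfCalledFrom(String methodName) throws IOException {
        for (StackTraceElement frame : new Exception().getStackTrace()) {
            if (frame.getMethodName().equals(methodName)) {
                throw new IOException("simulated failure inside " + methodName);
            }
        }
    }

    // Two stand-in operations; the hook only targets callers named "flush".
    static void flush() throws IOException {
        failIfCalledFrom("flush");
    }

    static void commit() throws IOException {
        failIfCalledFrom("flush");
    }

    /** Returns true if the action threw an IOException. */
    static boolean fails(IOAction action) {
        try {
            action.run();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("flush fails:  " + fails(StackTraceFailureSketch::flush));
        System.out.println("commit fails: " + fails(StackTraceFailureSketch::commit));
    }
}
```

A test built this way depends on the runtime reporting the expected frames.
If a runtime trims or omits frames from the trace, a hook like
failIfCalledFrom would silently stop firing, and the test could fail without
there being a real bug in the code under test.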

I would suggest benchmarking!

[1] http://weblog.ikvm.net/PermaLink.aspx?guid=388b2a6d-e7b2-4ffa-86e7-450c87e6178f

Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Hans Merkl <hm...@hmerkl.com>.
Personally I am willing to trade some performance for always being up to
date with the latest Java releases and also being able to use other Java
code. Although as far as I have seen most people say it's at the same speed
or even slightly faster than the .NET version.

I personally would be more likely to contribute to IKVM if there are any
issues since it would also benefit other Java code I use like TIKA. I wonder
if anybody has ever tried Lucene with IKVM in a heavy load production
environment. I use it in a one thread per index desktop app and wouldn't
notice if there were any issues under heavy load.

On Wed, Jan 26, 2011 at 14:01, Troy Howard <th...@gmail.com> wrote:

> I'm on the fence about IKVM.
>
> It has some significant benefits and some significant drawbacks:
>
> Benefits:
> - Allows us to get to a "commoditized" line-by-line .NET DLL in the
> fastest and easiest manner. No porting.
> - Reasonable performance profile
> - Well tested Java environment equivalence
>
> Drawbacks:
> - Black box: we can't improve on it or tweak its behaviour. If there are
> bugs or other issues related to IKVM (e.g. thread safety, memory handling,
> etc.) we can't fix those without dropping IKVM as our solution.
> - Adds an additional dependency
> - May not be the best possible performance profile. As DIGY said, it's
> roughly equivalent, but that doesn't mean that current Lucene.Net is
> fully optimized for .NET. In fact, it has been proven not to be by
> folks who have made custom builds/forks, realizing significant
> speedups using generics, etc.
>
> Also, that's a significant change in the library, which will introduce
> breaking API changes, and require us to beef up the unit tests to
> ensure that concerns like thread safety continue to behave as
> expected.
>
> Thanks,
> Troy

Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Troy Howard <th...@gmail.com>.
I'm on the fence about IKVM.

It has some significant benefits and some significant drawbacks:

Benefits:
- Allows us to get to a "commoditized" line-by-line .NET DLL in the
fastest and easiest manner. No porting.
- Reasonable performance profile
- Well tested Java environment equivalence

Drawbacks:
- Black box: we can't improve on it or tweak its behaviour. If there are
bugs or other issues related to IKVM (e.g. thread safety, memory handling,
etc.) we can't fix those without dropping IKVM as our solution.
- Adds an additional dependency
- May not be the best possible performance profile. As DIGY said, it's
roughly equivalent, but that doesn't mean that current Lucene.Net is
fully optimized for .NET. In fact, it has been proven not to be by
folks who have made custom builds/forks, realizing significant
speedups using generics, etc.

Also, that's a significant change in the library, which will introduce
breaking API changes, and require us to beef up the unit tests to
ensure that concerns like thread safety continue to behave as
expected.

Thanks,
Troy

On Wed, Jan 26, 2011 at 10:39 AM, Hans Merkl <hm...@hmerkl.com> wrote:
> A .NET wrapper around the IKVM classes may be a good idea.
>
> I like the idea that IKVM would also allow use of tons of other useful
> Java/Lucene code that's out there. There are some filters and analyzers in
> Java that might be very useful for my work. That's not really possible with
> the line-by-line port. It may be possible with Sharpen though.

Re: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Hans Merkl <hm...@hmerkl.com>.
A .NET wrapper around the IKVM classes may be a good idea.

I like the idea that IKVM would also allow use of tons of other useful
Java/Lucene code that's out there. There are some filters and analyzers in
Java that might be very useful for my work. That's not really possible with
the line-by-line port. It may be possible with Sharpen though.

On Wed, Jan 26, 2011 at 13:04, Digy <di...@gmail.com> wrote:

> In theory you can use ikvmc to compile the Java source files into a .NET
> DLL
> that references some IKVM DLLs and an ikvmc'ed version of OpenJDK's
> classlib.  After that it is a plain .NET DLL and one could write a .NET
> centric API using that DLL.
>
>
>
> I haven't really tried it on anything serious and it may become tricky if
> reflection gets involved.  And there is some layer of indirection you
> wouldn't have by a line by line translation that may lead to decreased
> performance.  I'd be game to try it out, though.
>
> ----
>
>
>
> A few years ago, I tried IKVM with ~300M documents (200-300 bytes each).
> It was surprisingly as fast as Lucene.Net. That may mean that we should fix
> something in the code.
>
>
>
> Reflection is another nice thing in IKVM. You can even load and execute
> Java classes :)
>
>
>
> DIGY
>
>
>
>

RE: Stefan's Newbie Questions (was Re: Proposal Status, Initial Committors List, Contributors List)

Posted by Digy <di...@gmail.com>.
> In theory you can use ikvmc to compile the Java source files into a .NET DLL
> that references some IKVM DLLs and an ikvmc'ed version of OpenJDK's
> classlib.  After that it is a plain .NET DLL and one could write a .NET
> centric API using that DLL.
>
> I haven't really tried it on anything serious and it may become tricky if
> reflection gets involved.  And there is some layer of indirection you
> wouldn't have by a line by line translation that may lead to decreased
> performance.  I'd be game to try it out, though.

----

 

A few years ago, I tried IKVM with ~300M documents (200-300 bytes each).
It was surprisingly as fast as Lucene.Net. That may mean that we should fix
something in the code.

 

Reflection is another nice thing in IKVM. You can even load and execute Java
classes :)
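
For readers unfamiliar with what "load and execute Java classes" means on the
Java side, here is a minimal sketch using the standard java.lang.reflect API.
It is plain Java and says nothing about how IKVM implements or exposes this;
the class name ReflectionLoadSketch and the helper invokeStatic are
hypothetical, made up for this illustration.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: loads a class by name and calls a static method on it.
class ReflectionLoadSketch {

    /**
     * Looks up className at run time, finds the public static method with the
     * given name and parameter types, and invokes it with args.
     */
    static Object invokeStatic(String className, String methodName,
                               Class<?>[] paramTypes, Object... args) {
        try {
            Class<?> cls = Class.forName(className);
            Method method = cls.getMethod(methodName, paramTypes);
            // The target is static, so the receiver argument is null.
            return method.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            // Wrap checked reflection errors so callers need no throws clause.
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        Object result = invokeStatic("java.lang.Integer", "parseInt",
                new Class<?>[] { String.class }, "42");
        System.out.println(result);
    }
}
```

The class name is only a string here, so it could just as well come from
configuration. Whether a given class resolves by name in a particular IKVM
setup is something to check there, not something this sketch demonstrates.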

 

DIGY