Posted to dev@lucenenet.apache.org by Troy Howard <th...@gmail.com> on 2011/01/01 01:50:11 UTC

Re: Proposal Stage: Net Idiomatic Api Version

I agree with the suggestion to follow the MS Coding standard. It's a
good general guideline. Specifically, I'd like to follow all
guidelines put forth in the book:

Framework Design Guidelines: Conventions, Idioms, and Patterns for
Reusable .NET Libraries by Krzysztof Cwalina and Brad Abrams
http://amzn.com/0321246756

There's also a lecture that Krzysztof gave that's available as an
offline video download here (the streaming version isn't available at
the moment for some reason):

http://download.microsoft.com/download/8/0/8/808412ec-2561-413d-a9e3-5cd47d37d763/FDGNetCast.zip


With regards to the specifics of the API, I think we should try to
bring together the existing forks (Lucere, Lucille, and Aimee.Net) and
attempt to merge them into a single consistent alternative API for
Lucene.Net. They all use similar but slightly different tactics to
".NETify" the codebase.

Also, significant community feedback will be necessary before we
proceed too far down that road. We'll have a lot of work ahead of us
just getting up-to-date releases finished for the 1:1 API port. It's
my opinion, though, that these can be separate and parallel development
efforts.

I made a request of the community in the Lucere project mailing list
to respond with ideas about what an ideal .NET API would look like,
and how it would function. Specifically, I was hoping to get
pseudo-code examples of how end users would like to use Lucene. Even
something as simple as:

// Pseudo-code: LuceneIndex, Field and Match are a wished-for API, not existing types.
using (var luceneIndex = LuceneIndex.Open(@"C:\foo\bar"))
{
  var hitDocs = from doc in luceneIndex
                where doc.Field["content"].Match("foo")
                select doc;
}

This represents a lot of ideas all in one little code snippet. Maybe
this isn't an ideal API, maybe it is... If we collect a bunch of code
samples from people like this, we can discuss the merits of various
ideas for the API and settle on an ideal way to present the
functionality of the library in a way that will integrate well with
the .NET 3.5/4.0 environment.
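
To make the ideas in that snippet easier to discuss, here is a minimal sketch of the two .NET idioms it combines: IDisposable (so that `using` works) and IEnumerable<T> (so that a LINQ query expression works). Everything in it is an assumption made for illustration: `LuceneIndex`, `Doc`, `FieldValue` and the in-memory document list are hypothetical stand-ins, not Lucene.Net types, and the factory is called as a static method without `new`. A real implementation would presumably translate the query into a Lucene query rather than scan documents.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical field value exposing the Match method used in the snippet.
public sealed class FieldValue
{
    private readonly string text;
    public FieldValue(string text) { this.text = text; }

    // Naive case-insensitive substring test; a stand-in for real query evaluation.
    public bool Match(string term)
    {
        return text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}

// Hypothetical document: a bag of named fields.
public sealed class Doc
{
    public Dictionary<string, FieldValue> Field { get; private set; }

    public Doc(string content)
    {
        Field = new Dictionary<string, FieldValue> { { "content", new FieldValue(content) } };
    }
}

// Hypothetical index: enumerable (for LINQ) and disposable (for using).
public sealed class LuceneIndex : IEnumerable<Doc>, IDisposable
{
    private readonly List<Doc> docs = new List<Doc>();
    public bool IsDisposed { get; private set; }

    private LuceneIndex() { }

    // Static factory. The path is ignored in this in-memory sketch.
    public static LuceneIndex Open(string path) { return new LuceneIndex(); }

    public void Add(Doc doc) { docs.Add(doc); }

    public IEnumerator<Doc> GetEnumerator() { return docs.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public void Dispose() { IsDisposed = true; }
}

public static class QueryDemo
{
    // Counts the documents whose "content" field matches "foo".
    public static int CountFooMatches()
    {
        using (var luceneIndex = LuceneIndex.Open(@"C:\foo\bar"))
        {
            luceneIndex.Add(new Doc("foo bar"));
            luceneIndex.Add(new Doc("something else"));

            var hitDocs = from doc in luceneIndex
                          where doc.Field["content"].Match("foo")
                          select doc;

            // LINQ queries are lazy, so materialize before the index is disposed.
            return hitDocs.Count();
        }
    }
}
```

One design point this surfaces: because the query is lazy, results have to be consumed inside the `using` block, which is worth keeping in mind when judging the snippet as an API shape.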

I didn't get a lot of responses in the Lucere mailing list but perhaps
the Lucene.Net community will have some ideas. We should probably
cross-post to the lucene-net-user mailing list with a request for
ideas.

Thanks,
Troy


On Fri, Dec 31, 2010 at 1:35 PM, Michael Herndon <mh...@o19s.com> wrote:
> *Net Idiomatic Api Version*
> What we should probably be looking for with these criteria is readability
> and getting people familiar with any new code base faster within their own
> idiom.
>
> Starting with a proposal that we use the internal MS coding guidelines
> <http://blogs.msdn.com/b/brada/archive/2005/01/26/361363.aspx> for the
> idiomatic version, not to make anyone's life miserable or coding less
> enjoyable or anything.
>
> But it's already documented, we can easily point to it without having to
> write up our own guidelines, and everyone who works inside of .NET should be
> remotely familiar with it, meaning someone can just come in and crank out
> code.
>
> If need be, we let people work on the code base in their own style and when
> they are done working on a particular area, let them reformat it or just run
> a tool that auto formats code before each release.
>
> I know there are religious wars fought over this stuff; I don't want to
> create one.  I could be wrong about the above, but again, the goals
> should be familiarity, comfort, and creating a bigger community.
>
> Also uses of core Interfaces, Annotations, & Classes where possible.  (What
> are some of these that you would like to see other than IDisposable?)
>
> A good book to comb over with the latest edition is the  "Framework Design
> Guidelines" 2nd edition.
>
>
>
> --
> Michael Herndon
>

RE: Proposal Stage: Net Idiomatic Api Version

Posted by "Nicholas Paldino [.NET/C# MVP]" <ca...@caspershouse.com>.
Peter,

	I was just about to comment on this (with the same links, mind you).
The ^proper^ implementation of IDisposable where Closeable is used is the
best approach.

	To Robert, just because IDisposable is implemented doesn't mean that
it ^must^ be used in a using statement (which is not the same as finally,
but has some similarities).  I've implemented wrappers around anything that
exposes a Close method in an older version of Lucene.NET so that I can use
IDisposable on them, including IndexReader.

	However, I choose not to use IndexReader in a using statement; I
open mine and store a reference to it.  The great thing about implementing
IDisposable properly is that one would have a finalizer as well which would
call Dispose (rather, the protected overload) in the event that the app
domain is torn down or the object reference is let go without Dispose being
called on it (I actually think the latter case is a design flaw on the part
of the consumer of the object, and would rather see an exception thrown in
this case, but you can't control other people's implementations, and
everyone follows the current guidelines).

	From the .NET perspective, the IndexReader/Writer will most
definitely have to implement IDisposable, as it will contain references to
other IDisposable implementations (Directory) which should also be disposed
of when Dispose is called.
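
	As a rough illustration of the pattern described above, here is a minimal sketch of an IDisposable implementation with a public Dispose, a protected overload, and a finalizer that calls the protected overload. `FakeDirectory` and `FakeIndexReader` are hypothetical stand-ins invented for this example; they are not the Lucene.Net classes and only show the shape of the pattern.

```csharp
using System;

// Hypothetical stand-in for a directory object; not the Lucene.Net Directory class.
public class FakeDirectory : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() { IsDisposed = true; }
}

// Hypothetical reader that owns another disposable object.
public class FakeIndexReader : IDisposable
{
    private readonly FakeDirectory directory;
    private bool disposed;

    public FakeIndexReader(FakeDirectory directory)
    {
        if (directory == null) throw new ArgumentNullException("directory");
        this.directory = directory;
    }

    public int DocCount()
    {
        // Using the object after disposal is a caller error.
        if (disposed) throw new ObjectDisposedException(GetType().Name);
        return 0; // placeholder value for the sketch
    }

    // Public entry point: deterministic cleanup, then skip finalization.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // The protected overload. 'disposing' is true when called from Dispose()
    // and false when called from the finalizer.
    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return; // safe to call more than once

        if (disposing)
        {
            // Only touch other managed objects on the deterministic path.
            directory.Dispose();
        }

        // Unmanaged handles, if the type held any, would be released here.
        disposed = true;
    }

    // Safety net if the consumer never calls Dispose.
    ~FakeIndexReader()
    {
        Dispose(false);
    }
}
```

	Note that the finalizer path deliberately does not touch the owned directory, since other managed objects may already have been finalized by then. Whether a finalizer is warranted at all for a type that holds only managed references is a separate design question.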

	It should be noted that there is a difference between an
implementation of the IDisposable interface and a Close method.  In the .NET
world, it's a given (though not codified through an interface or guideline)
that if you have a Close method, the following applies (again, not all the
time, but you can see it in practice):

- There is an Open method
- Dispose calls Close

	This way, you can use the resource in a using statement and it will
close the resource properly.  However, if you wanted, you could close the
resource and then reopen it with the Close/Open methods.  The most prominent
case of this is the DbConnection class
(http://msdn.microsoft.com/en-us/library/system.data.common.dbconnection.aspx),
which serves as the base class for all database connections in ADO.NET.
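
	The Open/Close/Dispose convention described above can be sketched as follows. `ReopenableResource` is a hypothetical type made up for this example (it is not DbConnection and not a Lucene.Net class); it only shows Dispose delegating to Close while Open and Close remain usable on their own.

```csharp
using System;

// Hypothetical resource that can be closed and reopened, with Dispose calling Close.
public sealed class ReopenableResource : IDisposable
{
    public bool IsOpen { get; private set; }
    public int OpenCount { get; private set; }

    public void Open()
    {
        if (IsOpen) throw new InvalidOperationException("Already open.");
        IsOpen = true;
        OpenCount++;
    }

    // Close is safe to call repeatedly.
    public void Close() { IsOpen = false; }

    // Dispose simply closes, so a using statement leaves the resource closed.
    public void Dispose() { Close(); }
}

public static class ReopenDemo
{
    // Opens, closes, and reopens inside a using block; returns the resource for inspection.
    public static ReopenableResource Run()
    {
        var resource = new ReopenableResource();
        using (resource)
        {
            resource.Open();
            resource.Close();
            resource.Open(); // reopened without creating a new object
        }
        return resource; // closed by Dispose at this point
    }
}
```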

	To Karell, sorry, but I'm going to have to snipe.  In regards to the
"quirks" you mention with the CLR GC, and having to free resources by
calling IDisposable in the face of that (and then saying "in a world where
GC exist, who would have thought"), the same exact situation exists with the
JVM.

	The CLR and the JVM GC are meant to deal with ^managed^ resources
and memory.  This means that when you allocate space from the heap, etc., it
can keep track of it.  However, when dealing with managed wrappers to
unmanaged resources, one needs to have a definitive way of releasing those
resources.  In the unmanaged world, it was assumed those resources were
disposed of as soon as their use was complete (typically through RAII); the
managed world does not make any guarantees, hence the need for IDisposable
in .NET and Closeable in Java.

	The point is, the situation that you say is a "quirk" with .NET is
directly related to the fact that .NET has to interact with code that is not
managed, and Java shares the exact situation, with the same caveats and
"quirks".

		- Nicholas Paldino [.NET/C# MVP]
		- casperOne@caspershouse.com

-----Original Message-----
From: Peter Mateja [mailto:peter.mateja@gmail.com] 
Sent: Tuesday, January 04, 2011 12:15 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Proposal Stage: Net Idiomatic Api Version

Robert... good points all.  I especially agree that basing initial idiomatic
work on 3.0+ makes sense (indeed, I believe this is what Lucere.Net had
agreed to do.)

Use of IDisposable can certainly lead to worst practices concerning
IndexReader / IndexWriter objects.  However, the IDisposable pattern
(if implemented correctly... see
http://msdn.microsoft.com/en-us/library/b1yfkh5e.aspx,
http://www.codeproject.com/KB/dotnet/idisposable.aspx and Framework Design
Patterns book mentioned earlier), really is the best way (in .Net) to ensure
proper handling of both unmanaged resources, and stateful managed resources.

I think a good combination of documentation and examples could do much to
discourage worst practices.  In some cases, the sample 'using' code you
refer to might be appropriate... though in most the lifetime of an
IndexWriter object might be controlled at a higher context (AppDomain, etc.)
 Let's ensure that Lucene.Net users know the how and why for each approach.

Peter Mateja
peter.mateja@gmail.com



On Tue, Jan 4, 2011 at 10:41 AM, Karell Ste-Marie
<st...@brain-bank.com> wrote:

> Robert,
>
> Thanks for stepping in,
>
> I personally found some of your suggestions quite interesting and
> completely agree that Lucene 3.0 may help quite a bit.
>
> Not that I want to place myself in the bullseye of any .NET snipers out
> there but the .NET framework (like any others) has its share of quirks. The
> main one that comes to mind is the garbage collector which is different than
> in Java. The same can be said for some of the behavior of the CLR when
> compared to the JVM. I recall implementing IDisposable myself in a few
> objects and while you may consider that the GC should run and free resources
> by calling Dispose on an IDisposable object this is actually a technique
> that is discouraged because there is no telling when the GC will actually
> free up a resource - you may laugh at this but when it comes to bad
> practices I've seen newbie .NET programmers easily create memory leaks by
> not manually closing resources (in a world where a GC exist, who would have
> thought...)
>
> In the end, the languages are different - the whole conversation about
> Generics could also be a very interesting topic, we could also talk about
> WCF quite a bit. This is where personally the line-to-line port of Lucene
> from Java to .NET is IMHO a difficult endeavor. One would not try to do a
> line-to-line port from Java to Perl. The languages are too different. But I
> think that because people perceive similarities between C# and Java that it
> is assumed that it's a good idea. However - and this is where opinions
> diverge - this would be in my point of view like trying to fit a gas engine
> in a diesel car. While the purposes are the same, the implementations should
> be different (at least in some areas) because the technologies are
> different.
>
> My Canadian 2 cents - subject to the exchange rate
>
>
> Karell Ste-Marie
> C.I.O. - BrainBank Inc
>
> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Tuesday, January 04, 2011 11:27 AM
> To: lucene-net-dev@lucene.apache.org
> Subject: Re: Proposal Stage: Net Idiomatic Api Version
>
> On Tue, Jan 4, 2011 at 10:49 AM, Peter Mateja <pe...@gmail.com>
> wrote:
> >> I made a request of the community in the Lucere project mailing list
> >> to respond with ideas about what an ideal .NET API would look like,
> >> and how it would function. Specifically, I was hoping to get
> >> pseudo-code examples of how end users would like to use Lucene. Even
> >> something as simple as:
> >>
> >> using(var luceneIndex = new LuceneIndex.Open("C:\foo\bar")) {
> >>  var hitDocs = from doc in luceneIndex where
> >> doc.Field["content"].Match("foo") select doc; }
>
> Hi guys, I know almost nothing of .NET (lucene java developer here), but I
> was hoping I could provide some suggestions here to help out.
>
> In glancing at some of the issues surrounding a more ".NET" api, i couldn't
> help but notice many of the issues people complain about, are because
> lucene.net hasn't implemented lucene 3.0
>
> # lucene 3.0.x is the same feature-wise as lucene 2.9.x, no new features.
> # lucene 3.0.x is Java 5, which is almost a different programming language
> than Java 4 (2.9.x). This means enums, generics, Closeable, foreach (instead
> of Iterators), autoboxing, annotations, etc.
>
> A lot of the issues people have raised seem to be due to the fact that
> lucene 2.9.x is in an ancient java version... I think if you ported 3.0,
> things would look a lot more idiomatic (although surely not perfect for .NET
> users, but better!).
>
> For example, taking a glance, I people making the .NET forks actually doing
> things like taking the 2.9.x code and converting Field.java to use enums,
> which is really a duplication of effort since we did this in java over a
> year ago!:
>
>
> http://svn.apache.org/repos/asf/lucene/java/branches/lucene_3_0/src/java/org/apache/lucene/document/Field.java
>
> So, I'm suggesting that one thing you could consider is to start focusing
> on lucene 3.0.x, to also produce a more idiomatic api automatically, and
> possibly this would be a good enough improvement to bring in some developers
> from those forks.
>
> Separately, I'm trying to understand the syntax you provided about
> IDisposable/using. Obviously, as part of your porting process you could take
> anything marked Closeable [we marked anything wtih a
> .close() as Closeable in Lucene 3.0], and mark it IDisposable, but is this
> really the best approach?
>
> For example, the syntax you provided seems like it would encourage closing
> the IndexReader and opening a new one for each search request... yet this is
> about the biggest no-no you can do with a lucene index... opening a new
> IndexReader is very heavy and you should re-use it across queries and not
> open/close them often... so in this case, a more idiomatic API could
> actually be bad, if it encourages worst practices...
>




Re: Proposal Stage: Net Idiomatic Api Version

Posted by Wyatt Barnett <wy...@gmail.com>.
Anyone else get the feeling that Java and C# are kind of like British
and American English -- two peoples separated by a nearly common
language?

As for inspiration, RavenDb might be a good place to start. It has
native Lucene querying capabilities and the API is quite sexy and has
a number of good people designing and beating on it.

On Tue, Jan 4, 2011 at 7:44 PM, Prescott Nasser <ge...@hotmail.com> wrote:
>
> I think good documentation, examples that have best practices is key to
> fostering a good Lucene.Net community. No question in my mind that we would
> do this.
> ~Prescott
>
>
>
>> Subject: RE: Proposal Stage: Net Idiomatic Api Version
>> Date: Tue, 4 Jan 2011 12:32:45 -0500
>> From: stemarie@brain-bank.com
>> To: lucene-net-dev@lucene.apache.org
>>
>> Peter,
>>
>> I completely agree - upon reading my last post it lacked a critical
>> component to actually bring some value to the conversation which you
>> mentioned. The USING keyword is key, perhaps as Robert mentioned it may
>> not be in the best of lights given the context of the example but that
>> is indeed how it should be used in the .NET framework.
>>
>> Perhaps the documentation for Lucene.NET can include examples that
>> demonstrate the use of some of the expensive classes implemented as
>> Singletons - perhaps even code that up for the client as part of the
>> library itself (or in code examples). Clumsy coders would then not be
>> able to "mess up" the performance of Lucene.NET as much as they could
>> given their broad control over some of these objects and their lifetime.
>>
>>
>>
>> Karell Ste-Marie
>> C.I.O. - BrainBank Inc
>>
>>
>> -----Original Message-----
>> From: Peter Mateja [mailto:peter.mateja@gmail.com]
>> Sent: Tuesday, January 04, 2011 12:15 PM
>> To: lucene-net-dev@lucene.apache.org
>> Subject: Re: Proposal Stage: Net Idiomatic Api Version
>>
>> Robert... good points all.  I especially agree that basing initial
>> idiomatic work on 3.0+ makes sense (indeed, I believe this is what
>> Lucere.Net had agreed to do.)
>>
>> Use of IDisposable can certainly lead to worst practices concerning
>> IndexReader / IndexWriter objects.  However, the IDisposable pattern (if
>> implemented correctly... see
>> http://msdn.microsoft.com/en-us/library/b1yfkh5e.aspx,
>> http://www.codeproject.com/KB/dotnet/idisposable.aspx and Framework
>> Design Patterns book mentioned earlier), really is the best way (in
>> .Net) to ensure proper handling of both unmanaged resources, and
>> stateful managed resources.
>>
>> I think a good combination of documentation and examples could do much
>> to discourage worst practices.  In some cases, the sample 'using' code
>> you refer to might be appropriate... though in most the lifetime of an
>> IndexWriter object might be controlled at a higher context (AppDomain,
>> etc.)  Let's ensure that Lucene.Net users know the how and why for each
>> approach.
>>
>> Peter Mateja
>> peter.mateja@gmail.com
>

RE: Proposal Stage: Net Idiomatic Api Version

Posted by Prescott Nasser <ge...@hotmail.com>.
I think good documentation, with examples that follow best practices, is key to fostering a good Lucene.Net community. No question in my mind that we would do this.
~Prescott



> Subject: RE: Proposal Stage: Net Idiomatic Api Version
> Date: Tue, 4 Jan 2011 12:32:45 -0500
> From: stemarie@brain-bank.com
> To: lucene-net-dev@lucene.apache.org
> 
> Peter,
> 
> I completely agree - upon reading my last post it lacked a critical
> component to actually bring some value to the conversation which you
> mentioned. The USING keyword is key, perhaps as Robert mentioned it may
> not be in the best of lights given the context of the example but that
> is indeed how it should be used in the .NET framework.
> 
> Perhaps the documentation for Lucene.NET can include examples that
> demonstrate the use of some of the expensive classes implemented as
> Singletons - perhaps even code that up for the client as part of the
> library itself (or in code examples). Clumsy coders would then not be
> able to "mess up" the performance of Lucene.NET as much as they could
> given their broad control over some of these objects and their lifetime.
> 
> 
> 
> Karell Ste-Marie
> C.I.O. - BrainBank Inc
> 
> 
> -----Original Message-----
> From: Peter Mateja [mailto:peter.mateja@gmail.com] 
> Sent: Tuesday, January 04, 2011 12:15 PM
> To: lucene-net-dev@lucene.apache.org
> Subject: Re: Proposal Stage: Net Idiomatic Api Version
> 
> Robert... good points all.  I especially agree that basing initial
> idiomatic work on 3.0+ makes sense (indeed, I believe this is what
> Lucere.Net had agreed to do.)
> 
> Use of IDisposable can certainly lead to worst practices concerning
> IndexReader / IndexWriter objects.  However, the IDisposable pattern (if
> implemented correctly... see
> http://msdn.microsoft.com/en-us/library/b1yfkh5e.aspx,
> http://www.codeproject.com/KB/dotnet/idisposable.aspx and Framework
> Design Patterns book mentioned earlier), really is the best way (in
> .Net) to ensure proper handling of both unmanaged resources, and
> stateful managed resources.
> 
> I think a good combination of documentation and examples could do much
> to discourage worst practices.  In some cases, the sample 'using' code
> you refer to might be appropriate... though in most the lifetime of an
> IndexWriter object might be controlled at a higher context (AppDomain,
> etc.)  Let's ensure that Lucene.Net users know the how and why for each
> approach.
> 
> Peter Mateja
> peter.mateja@gmail.com
 		 	   		  

RE: Proposal Stage: Net Idiomatic Api Version

Posted by Karell Ste-Marie <st...@brain-bank.com>.
Peter,

I completely agree - upon reading my last post, it lacked a critical
component to actually bring some value to the conversation, which you
mentioned. The using keyword is key; perhaps, as Robert mentioned, it may
not be shown in the best of lights given the context of the example, but
that is indeed how it should be used in the .NET framework.

Perhaps the documentation for Lucene.NET can include examples that
demonstrate the use of some of the expensive classes implemented as
Singletons - perhaps even code that up for the client as part of the
library itself (or in code examples). Clumsy coders would then not be
able to "mess up" the performance of Lucene.NET as much as they could
given their broad control over some of these objects and their lifetime.
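
A minimal sketch of that idea, assuming .NET 4.0's Lazy<T> is available: one expensive object is created on first use and then shared by every caller. `ExpensiveSearcher` and `SharedSearcher` are hypothetical names invented for this example, not Lucene.Net types, and the `Search` body is a placeholder.

```csharp
using System;
using System.Threading;

// Hypothetical expensive-to-create object; counts how many times it is constructed.
public sealed class ExpensiveSearcher
{
    private static int instancesCreated;
    public static int InstancesCreated { get { return instancesCreated; } }

    public ExpensiveSearcher()
    {
        Interlocked.Increment(ref instancesCreated);
    }

    // Placeholder for a real search; returns the term length so there is a result to inspect.
    public int Search(string term) { return term.Length; }
}

// Hypothetical holder that hands out a single shared instance.
public static class SharedSearcher
{
    // Created on first access; ExecutionAndPublication makes the creation thread-safe.
    private static readonly Lazy<ExpensiveSearcher> instance =
        new Lazy<ExpensiveSearcher>(() => new ExpensiveSearcher(),
                                    LazyThreadSafetyMode.ExecutionAndPublication);

    public static ExpensiveSearcher Instance { get { return instance.Value; } }
}
```

Callers would write `SharedSearcher.Instance.Search("foo")` and never construct the object themselves. The trade-off is that the lifetime is now fixed by the library, which may not suit applications that need to reopen an index.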



Karell Ste-Marie
C.I.O. - BrainBank Inc


-----Original Message-----
From: Peter Mateja [mailto:peter.mateja@gmail.com] 
Sent: Tuesday, January 04, 2011 12:15 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Proposal Stage: Net Idiomatic Api Version

Robert... good points all.  I especially agree that basing initial
idiomatic work on 3.0+ makes sense (indeed, I believe this is what
Lucere.Net had agreed to do.)

Use of IDisposable can certainly lead to worst practices concerning
IndexReader / IndexWriter objects.  However, the IDisposable pattern (if
implemented correctly... see
http://msdn.microsoft.com/en-us/library/b1yfkh5e.aspx,
http://www.codeproject.com/KB/dotnet/idisposable.aspx and Framework
Design Patterns book mentioned earlier), really is the best way (in
.Net) to ensure proper handling of both unmanaged resources, and
stateful managed resources.

I think a good combination of documentation and examples could do much
to discourage worst practices.  In some cases, the sample 'using' code
you refer to might be appropriate... though in most the lifetime of an
IndexWriter object might be controlled at a higher context (AppDomain,
etc.)  Let's ensure that Lucene.Net users know the how and why for each
approach.

Peter Mateja
peter.mateja@gmail.com

Re: Proposal Stage: Net Idiomatic Api Version

Posted by Peter Mateja <pe...@gmail.com>.
Robert... good points all.  I especially agree that basing initial idiomatic
work on 3.0+ makes sense (indeed, I believe this is what Lucere.Net had
agreed to do.)

Use of IDisposable can certainly lead to worst practices concerning
IndexReader / IndexWriter objects.  However, the IDisposable pattern
(if implemented correctly... see
http://msdn.microsoft.com/en-us/library/b1yfkh5e.aspx,
http://www.codeproject.com/KB/dotnet/idisposable.aspx and the Framework Design
Guidelines book mentioned earlier) really is the best way (in .NET) to ensure
proper handling of both unmanaged resources and stateful managed resources.

I think a good combination of documentation and examples could do much to
discourage worst practices.  In some cases, the sample 'using' code you
refer to might be appropriate... though in most cases the lifetime of an
IndexWriter object might be controlled at a higher context (AppDomain, etc.).
Let's ensure that Lucene.Net users know the how and why for each approach.

Peter Mateja
peter.mateja@gmail.com



On Tue, Jan 4, 2011 at 10:41 AM, Karell Ste-Marie
<st...@brain-bank.com> wrote:

> Robert,
>
> Thanks for stepping in,
>
> I personally found some of your suggestions quite interesting and
> completely agree that Lucene 3.0 may help quite a bit.
>
> Not that I want to place myself in the bullseye of any .NET snipers out
> there but the .NET framework (like any others) has its share of quirks. The
> main one that comes to mind is the garbage collector which is different than
> in Java. The same can be said for some of the behavior of the CLR when
> compared to the JVM. I recall implementing IDisposable myself in a few
> objects and while you may consider that the GC should run and free resources
> by calling Dispose on an IDisposable object this is actually a technique
> that is discouraged because there is no telling when the GC will actually
> free up a resource - you may laugh at this but when it comes to bad
> practices I've seen newbie .NET programmers easily create memory leaks by
> not manually closing resources (in a world where a GC exist, who would have
> thought...)
>
> In the end, the languages are different - the whole conversation about
> Generics could also be a very interesting topic, we could also talk about
> WCF quite a bit. This is where personally the line-to-line port of Lucene
> from Java to .NET is IMHO a difficult endeavor. One would not try to do a
> line-to-line port from Java to Perl. The languages are too different. But I
> think that because people perceive similarities between C# and Java that it
> is assumed that it's a good idea. However - and this is where opinions
> diverge - this would be in my point of view like trying to fit a gas engine
> in a diesel car. While the purposes are the same, the implementations should
> be different (at least in some areas) because the technologies are
> different.
>
> My Canadian 2 cents - subject to the exchange rate
>
>
> Karell Ste-Marie
> C.I.O. - BrainBank Inc
>
> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Tuesday, January 04, 2011 11:27 AM
> To: lucene-net-dev@lucene.apache.org
> Subject: Re: Proposal Stage: Net Idiomatic Api Version
>
> On Tue, Jan 4, 2011 at 10:49 AM, Peter Mateja <pe...@gmail.com>
> wrote:
> >> I made a request of the community in the Lucere project mailing list
> >> to respond with ideas about what an ideal .NET API would look like,
> >> and how it would function. Specifically, I was hoping to get
> >> pseudo-code examples of how end users would like to use Lucene. Even
> >> something as simple as:
> >>
> >> using(var luceneIndex = new LuceneIndex.Open("C:\foo\bar")) {
> >>  var hitDocs = from doc in luceneIndex where
> >> doc.Field["content"].Match("foo") select doc; }
>
> Hi guys, I know almost nothing of .NET (lucene java developer here), but I
> was hoping I could provide some suggestions here to help out.
>
> In glancing at some of the issues surrounding a more ".NET" api, i couldn't
> help but notice many of the issues people complain about, are because
> lucene.net hasn't implemented lucene 3.0
>
> # lucene 3.0.x is the same feature-wise as lucene 2.9.x, no new features.
> # lucene 3.0.x is Java 5, which is almost a different programming language
> than Java 4 (2.9.x). This means enums, generics, Closeable, foreach (instead
> of Iterators), autoboxing, annotations, etc.
>
> A lot of the issues people have raised seem to be due to the fact that
> lucene 2.9.x is in an ancient java version... I think if you ported 3.0,
> things would look a lot more idiomatic (although surely not perfect for .NET
> users, but better!).
>
> For example, taking a glance, I people making the .NET forks actually doing
> things like taking the 2.9.x code and converting Field.java to use enums,
> which is really a duplication of effort since we did this in java over a
> year ago!:
>
>
> http://svn.apache.org/repos/asf/lucene/java/branches/lucene_3_0/src/java/org/apache/lucene/document/Field.java
>
> So, I'm suggesting that one thing you could consider is to start focusing
> on lucene 3.0.x, to also produce a more idiomatic api automatically, and
> possibly this would be a good enough improvement to bring in some developers
> from those forks.
>
> Separately, I'm trying to understand the syntax you provided about
> IDisposable/using. Obviously, as part of your porting process you could take
> anything marked Closeable [we marked anything wtih a
> .close() as Closeable in Lucene 3.0], and mark it IDisposable, but is this
> really the best approach?
>
> For example, the syntax you provided seems like it would encourage closing
> the IndexReader and opening a new one for each search request... yet this is
> about the biggest no-no you can do with a lucene index... opening a new
> IndexReader is very heavy and you should re-use it across queries and not
> open/close them often... so in this case, a more idiomatic API could
> actually be bad, if it encourages worst practices...
>

RE: Proposal Stage: Net Idiomatic Api Version

Posted by Karell Ste-Marie <st...@brain-bank.com>.
Robert,

Thanks for stepping in, 

I personally found some of your suggestions quite interesting and completely agree that Lucene 3.0 may help quite a bit.

Not that I want to place myself in the bullseye of any .NET snipers out there, but the .NET framework (like any other) has its share of quirks. The main one that comes to mind is the garbage collector, which is different than in Java. The same can be said for some of the behavior of the CLR when compared to the JVM. I recall implementing IDisposable myself in a few objects, and while you may consider that the GC should run and free resources by calling Dispose on an IDisposable object, this is actually a technique that is discouraged because there is no telling when the GC will actually free up a resource - you may laugh at this, but when it comes to bad practices I've seen newbie .NET programmers easily create memory leaks by not manually closing resources (in a world where a GC exists, who would have thought...)

In the end, the languages are different - the whole conversation about Generics could also be a very interesting topic, we could also talk about WCF quite a bit. This is where personally the line-to-line port of Lucene from Java to .NET is IMHO a difficult endeavor. One would not try to do a line-to-line port from Java to Perl. The languages are too different. But I think that because people perceive similarities between C# and Java that it is assumed that it's a good idea. However - and this is where opinions diverge - this would be in my point of view like trying to fit a gas engine in a diesel car. While the purposes are the same, the implementations should be different (at least in some areas) because the technologies are different.

My Canadian 2 cents - subject to the exchange rate


Karell Ste-Marie
C.I.O. - BrainBank Inc

-----Original Message-----
From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Tuesday, January 04, 2011 11:27 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Proposal Stage: Net Idiomatic Api Version

On Tue, Jan 4, 2011 at 10:49 AM, Peter Mateja <pe...@gmail.com> wrote:
>> I made a request of the community in the Lucere project mailing list 
>> to respond with ideas about what an ideal .NET API would look like, 
>> and how it would function. Specifically, I was hoping to get 
>> pseudo-code examples of how end users would like to use Lucene. Even 
>> something as simple as:
>>
>> using(var luceneIndex = new LuceneIndex.Open("C:\foo\bar")) {
>>  var hitDocs = from doc in luceneIndex where
>> doc.Field["content"].Match("foo") select doc; }

Hi guys, I know almost nothing of .NET (lucene java developer here), but I was hoping I could provide some suggestions here to help out.

In glancing at some of the issues surrounding a more ".NET" api, I couldn't help but notice that many of the issues people complain about are because lucene.net hasn't implemented lucene 3.0

# lucene 3.0.x is the same feature-wise as lucene 2.9.x, no new features.
# lucene 3.0.x is Java 5, which is almost a different programming language than Java 4 (2.9.x). This means enums, generics, Closeable, foreach (instead of Iterators), autoboxing, annotations, etc.

A lot of the issues people have raised seem to be due to the fact that lucene 2.9.x is in an ancient java version... I think if you ported 3.0, things would look a lot more idiomatic (although surely not perfect for .NET users, but better!).

For example, taking a glance, I see people making the .NET forks actually doing things like taking the 2.9.x code and converting Field.java to use enums, which is really a duplication of effort since we did this in java over a year ago!:

http://svn.apache.org/repos/asf/lucene/java/branches/lucene_3_0/src/java/org/apache/lucene/document/Field.java

So, I'm suggesting that one thing you could consider is to start focusing on lucene 3.0.x, to also produce a more idiomatic api automatically, and possibly this would be a good enough improvement to bring in some developers from those forks.

Separately, I'm trying to understand the syntax you provided about IDisposable/using. Obviously, as part of your porting process you could take anything marked Closeable [we marked anything with a .close() as Closeable in Lucene 3.0], and mark it IDisposable, but is this really the best approach?

For example, the syntax you provided seems like it would encourage closing the IndexReader and opening a new one for each search request... yet this is about the biggest no-no you can do with a lucene index... opening a new IndexReader is very heavy and you should re-use it across queries and not open/close them often... so in this case, a more idiomatic API could actually be bad, if it encourages worst practices...

Re: Proposal Stage: Net Idiomatic Api Version

Posted by Robert Muir <rc...@gmail.com>.
On Tue, Jan 4, 2011 at 11:33 AM, Michael Herndon <mh...@o19s.com> wrote:
> IDisposable has more to do with releasing resources and GC
>

according to the docs here:
http://msdn.microsoft.com/en-us/library/yh598w02.aspx

It seems the 'using' statement acts like Java's 'finally' block (and
Dispose will be called even in an exceptional case), thus, if you turn
every Closeable into an IDisposable and wrap it in 'using', you will get
the worst-case performance pattern that I mentioned.

Many of Lucene's objects (IndexReader, IndexWriter) are quite
expensive to close/open, and this is why I mentioned that IDisposable
can be considered harmful.
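
To make the cost difference concrete, here is a rough sketch in Java. It
does not use the Lucene API: CostlyReader is a made-up stand-in for
something that is expensive to open, and the counter only records how
often it gets opened. It also assumes Java's try-with-resources (a later
addition to the language than Java 5), which plays roughly the role of
C#'s 'using'.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

public class ReaderReuseSketch {
    // Counts how many times the (hypothetical) expensive reader was opened.
    static final AtomicInteger opens = new AtomicInteger();

    // Hypothetical stand-in for an expensive-to-open resource; not a Lucene class.
    static final class CostlyReader implements Closeable {
        private boolean closed;

        CostlyReader() {
            opens.incrementAndGet(); // pretend this is the heavy part
        }

        int search(String term) {
            if (closed) {
                throw new IllegalStateException("reader is closed");
            }
            return term.length(); // dummy "hit count"
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // The pattern a per-request 'using' block encourages: open, search, close.
    static int searchOpeningEachTime(String[] terms) {
        int total = 0;
        for (String term : terms) {
            try (CostlyReader reader = new CostlyReader()) {
                total += reader.search(term);
            }
        }
        return total;
    }

    // The recommended pattern: one reader shared across all searches.
    static int searchWithSharedReader(String[] terms) {
        int total = 0;
        try (CostlyReader reader = new CostlyReader()) {
            for (String term : terms) {
                total += reader.search(term);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String[] terms = {"foo", "bar", "baz"};

        opens.set(0);
        searchOpeningEachTime(terms);
        System.out.println("opens, one reader per search: " + opens.get());

        opens.set(0);
        searchWithSharedReader(terms);
        System.out.println("opens, one shared reader: " + opens.get());
    }
}
```

Both methods return the same result; only the number of opens differs
(one per search versus one in total). The scoped block itself is not the
problem, it just has to wrap the lifetime of the application or index
rather than a single search request.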

Re: Proposal Stage: Net Idiomatic Api Version

Posted by Michael Herndon <mh...@o19s.com>.
IDisposable has more to do with releasing resources and GC

On Tue, Jan 4, 2011 at 11:27 AM, Robert Muir <rc...@gmail.com> wrote:

> On Tue, Jan 4, 2011 at 10:49 AM, Peter Mateja <pe...@gmail.com>
> wrote:
> >> I made a request of the community in the Lucere project mailing list
> >> to respond with ideas about what an ideal .NET API would look like,
> >> and how it would function. Specifically, I was hoping to get
> >> pseudo-code examples of how end users would like to use Lucene. Even
> >> something as simple as:
> >>
> >> using(var luceneIndex = new LuceneIndex.Open("C:\foo\bar"))
> >> {
> >>  var hitDocs = from doc in luceneIndex where
> >> doc.Field["content"].Match("foo") select doc;
> >> }
>
> Hi guys, I know almost nothing of .NET (lucene java developer here),
> but I was hoping I could provide some suggestions here to help out.
>
> In glancing at some of the issues surrounding a more ".NET" api, i
> couldn't help but notice many of the issues people complain about, are
> because lucene.net hasn't implemented lucene 3.0
>
> # lucene 3.0.x is the same feature-wise as lucene 2.9.x, no new features.
> # lucene 3.0.x is Java 5, which is almost a different programming
> language than Java 4 (2.9.x). This means enums, generics, Closeable,
> foreach (instead of Iterators), autoboxing, annotations, etc.
>
> A lot of the issues people have raised seem to be due to the fact that
> lucene 2.9.x is in an ancient java version... I think if you ported
> 3.0, things would look a lot more idiomatic (although surely not
> perfect for .NET users, but better!).
>
> For example, taking a glance, I see people making the .NET forks actually
> doing things like taking the 2.9.x code and converting Field.java to
> use enums, which is really a duplication of effort since we did this
> in java over a year ago!:
>
>
> http://svn.apache.org/repos/asf/lucene/java/branches/lucene_3_0/src/java/org/apache/lucene/document/Field.java
>
> So, I'm suggesting that one thing you could consider is to start
> focusing on lucene 3.0.x, to also produce a more idiomatic api
> automatically, and possibly this would be a good enough improvement to
> bring in some developers from those forks.
>
> Separately, I'm trying to understand the syntax you provided about
> IDisposable/using. Obviously, as part of your porting process you
> could take anything marked Closeable [we marked anything with a
> .close() as Closeable in Lucene 3.0], and mark it IDisposable, but is
> this really the best approach?
>
> For example, the syntax you provided seems like it would encourage
> closing the IndexReader and opening a new one for each search
> request... yet this is about the biggest no-no you can do with a
> lucene index... opening a new IndexReader is very heavy and you should
> re-use it across queries and not open/close them often... so in this
> case, a more idiomatic API could actually be bad, if it encourages
> worst practices...
>



-- 
Michael Herndon
Senior Developer (mherndon@o19s.com)
804.767.0083


Re: Proposal Stage: Net Idiomatic Api Version

Posted by Robert Muir <rc...@gmail.com>.
On Tue, Jan 4, 2011 at 10:49 AM, Peter Mateja <pe...@gmail.com> wrote:
>> I made a request of the community in the Lucere project mailing list
>> to respond with ideas about what an ideal .NET API would look like,
>> and how it would function. Specifically, I was hoping to get
>> pseudo-code examples of how end users would like to use Lucene. Even
>> something as simple as:
>>
>> using(var luceneIndex = new LuceneIndex.Open("C:\foo\bar"))
>> {
>>  var hitDocs = from doc in luceneIndex where
>> doc.Field["content"].Match("foo") select doc;
>> }

Hi guys, I know almost nothing of .NET (lucene java developer here),
but I was hoping I could provide some suggestions here to help out.

In glancing at some of the issues surrounding a more ".NET" api, i
couldn't help but notice many of the issues people complain about, are
because lucene.net hasn't implemented lucene 3.0

# lucene 3.0.x is the same feature-wise as lucene 2.9.x, no new features.
# lucene 3.0.x is Java 5, which is almost a different programming
language than Java 4 (2.9.x). This means enums, generics, Closeable,
foreach (instead of Iterators), autoboxing, annotations, etc.

A lot of the issues people have raised seem to be due to the fact that
lucene 2.9.x is in an ancient java version... I think if you ported
3.0, things would look a lot more idiomatic (although surely not
perfect for .NET users, but better!).

For example, taking a glance, I see people making the .NET forks actually
doing things like taking the 2.9.x code and converting Field.java to
use enums, which is really a duplication of effort since we did this
in java over a year ago!:

http://svn.apache.org/repos/asf/lucene/java/branches/lucene_3_0/src/java/org/apache/lucene/document/Field.java
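
As a rough illustration of the kind of change meant here (this is not
the actual Field.java; OldStore, Store and describe are made-up names
for the example), a pre-Java-5 hand-written constant class next to the
Java 5 enum that replaces it might look like this:

```java
public class FieldOptionsSketch {
    // Pre-Java-5 style: a hand-written "typesafe enum" class (hypothetical example).
    static final class OldStore {
        private final String name;

        private OldStore(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }

        static final OldStore YES = new OldStore("YES");
        static final OldStore NO = new OldStore("NO");
    }

    // Java 5 style: the language provides the same thing directly.
    enum Store { YES, NO }

    // Enums can be used in switch statements, which the old pattern cannot.
    static String describe(Store store) {
        switch (store) {
            case YES:
                return "stored";
            case NO:
                return "not stored";
            default:
                throw new AssertionError(store);
        }
    }

    public static void main(String[] args) {
        System.out.println(OldStore.YES);          // old pattern still prints its name
        System.out.println(describe(Store.YES));   // switch over the enum
        System.out.println(Store.valueOf("NO"));   // lookup by name comes for free
    }
}
```

A C# enum maps onto the second form almost directly, which is why
porting from the 3.0 sources should avoid redoing this conversion by
hand.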

So, I'm suggesting that one thing you could consider is to start
focusing on lucene 3.0.x, to also produce a more idiomatic api
automatically, and possibly this would be a good enough improvement to
bring in some developers from those forks.

Separately, I'm trying to understand the syntax you provided about
IDisposable/using. Obviously, as part of your porting process you
could take anything marked Closeable [we marked anything with a
.close() as Closeable in Lucene 3.0], and mark it IDisposable, but is
this really the best approach?

For example, the syntax you provided seems like it would encourage
closing the IndexReader and opening a new one for each search
request... yet this is about the biggest no-no you can do with a
lucene index... opening a new IndexReader is very heavy and you should
re-use it across queries and not open/close them often... so in this
case, a more idiomatic API could actually be bad, if it encourages
worst practices...

Re: Proposal Stage: Net Idiomatic Api Version

Posted by Peter Mateja <pe...@gmail.com>.
ReSharper <http://www.jetbrains.com/resharper/> is a fantastic tool for
auto-formatting code to a particular standard.  I haven't done a complete
sweep, but it seems that the default settings match the Microsoft guidelines
closely.  It isn't free unfortunately, but if you're a professional .Net
developer it makes life much easier!

Also, I 2nd the Krzysztof book.  Excellent reading.  I'll dig it out and
give it another scan.

Peter Mateja
peter.mateja@gmail.com



On Fri, Dec 31, 2010 at 6:50 PM, Troy Howard <th...@gmail.com> wrote:

> I agree with the suggestion to follow the MS Coding standard. It's a
> good general guideline. Specifically, I'd like to follow all
> guidelines put forth in the book:
>
> Framework Design Guidelines: Conventions, Idioms, and Patterns for
> Reusable .NET Libraries by Krzysztof Cwalina and Brad Abrams
> http://amzn.com/0321246756
>
> There's also a lecture that Krzysztof gave that's available as an
> offline video download here (the streaming version isn't available at
> the moment for some reason):
>
>
> http://download.microsoft.com/download/8/0/8/808412ec-2561-413d-a9e3-5cd47d37d763/FDGNetCast.zip
>
>
> With regards to the specifics of the API, I think we should try to
> bring together the existing forks (Lucere, Lucille, and Aimee.Net) and
> attempt to merge them into a single consistent alternative API for
> Lucene.Net. They all use similar but slightly different tactics to
> ".NETify" the codebase.
>
> Also, significant community feedback will be necessary before we
> proceed too far down that road. We'll have a lot of work ahead of us
> just getting up to date releases finished for the 1:1 API port. It's
> my opinion though, that these can be separate and parallel development
> efforts.
>
> I made a request of the community in the Lucere project mailing list
> to respond with ideas about what an ideal .NET API would look like,
> and how it would function. Specifically, I was hoping to get
> pseudo-code examples of how end users would like to use Lucene. Even
> something as simple as:
>
> using(var luceneIndex = new LuceneIndex.Open("C:\foo\bar"))
> {
>  var hitDocs = from doc in luceneIndex where
> doc.Field["content"].Match("foo") select doc;
> }
>
> This represents a lot of ideas all in one little code snippet. Maybe
> this isn't an ideal API, maybe it is... If we collect a bunch of code
> samples from people like this, we can discuss the merits of various
> ideas for the API and settle on an ideal way to present the
> functionality of the library in a way that will integrate well with
> the .NET 3.5/4.0 environment.
>
> I didn't get a lot of responses in the Lucere mailing list but perhaps
> the Lucene.Net community will have some ideas. We should probably
> cross-post to the lucene-net-user mailing list with a request for
> ideas.
>
> Thanks,
> Troy
>
>
> On Fri, Dec 31, 2010 at 1:35 PM, Michael Herndon <mh...@o19s.com>
> wrote:
> > *Net Idiomatic Api Version*
> > *What we should probably be looking for with these criteria is
> > readability and getting people familiar with any new code base faster
> > within their own idiom.*
> >
> > Starting with a proposal that we use the internal MS coding
> > guidelines <http://blogs.msdn.com/b/brada/archive/2005/01/26/361363.aspx>
> > for the idiomatic version, not to make anyone's life miserable or coding
> > less enjoyable or anything.
> >
> > But it's already documented, we can easily point to it without having to
> > write up our own guidelines, and everyone who works inside of .NET should
> > be remotely familiar with it, meaning someone can just come in and crank
> > out code.
> >
> > If need be, we let people work on the code base in their own style and
> > when they are done working on a particular area, let them reformat it or
> > just run a tool that auto-formats code before each release.
> >
> > I know there are religious wars fought over this stuff; I don't want to
> > create one. I could be wrong about the above, but again, the goals
> > should be familiarity, comfort, and creating a bigger community.
> >
> > Also, use of core Interfaces, Annotations, & Classes where possible.
> > (What are some of these that you would like to see other than
> > IDisposable?)
> >
> > A good book to comb over is the latest edition of "Framework Design
> > Guidelines" (2nd edition).
> >
> >
> >
> > --
> > Michael Herndon
> >
>