Posted to dev@lucene.apache.org by Mike Hogan <me...@mikehogan.net> on 2003/10/05 15:56:25 UTC

Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Erik,

> Mike,
>
> I disagree with your request to change things just for testing
> purposes.  There are ways to test most (if not all) of Lucene without
> having to affect method/class access levels.  As big as I am into
> testing, I'm also big into using the right design for the right job.
> In the case of Lucene, 'final' is used liberally as well as other
> non-public access levels.
>
> Could you give us an example of what you're trying to test and how
> you're wishing you could go about it so that we could perhaps offer
> alternatives?

I am trying to test this:

public void index(String componentId, String componentDescription)
        throws SearchService.Exception {
    IndexWriter writer = null;
    try {
        writer = new IndexWriter(INDEX_FILE_PATH, new StandardAnalyzer(),
                !indexExists());
        final Document document = new Document();
        document.add(Field.Text("id", componentId));
        document.add(Field.Text("contents", componentDescription));
        writer.addDocument(document);
        writer.optimize();
        writer.close();
    } catch (IOException e) {
        throw new SearchService.Exception("Exception updating Lucene index", e);
    }
}

I want to do it without a dependency on the file system.  So I want a mock
IndexWriter that does what I configure it to, sometimes throwing an
IOException, some times not, and always storing the Document passed to it,
so I can verify() the document is as it should be.
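
To make that concrete, here is a minimal sketch of the kind of mock I mean.
It deliberately uses no Lucene classes: DocumentWriter, MockDocumentWriter
and ComponentIndexer are hypothetical names invented for illustration, a
plain Map stands in for Document, and IllegalStateException stands in for
SearchService.Exception.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical seam: only the writer operations that index() needs.
interface DocumentWriter {
    void addDocument(Map<String, String> fields) throws IOException;
    void optimize() throws IOException;
    void close() throws IOException;
}

// Hand-rolled mock: records every call, always stores the document it is
// given, and can be configured to throw an IOException.
class MockDocumentWriter implements DocumentWriter {
    final List<Map<String, String>> added = new ArrayList<>();
    final List<String> calls = new ArrayList<>();
    boolean failOnAdd = false;

    public void addDocument(Map<String, String> fields) throws IOException {
        calls.add("addDocument");
        added.add(fields);
        if (failOnAdd) {
            throw new IOException("simulated failure");
        }
    }

    public void optimize() { calls.add("optimize"); }

    public void close() { calls.add("close"); }
}

// Stand-in for the service under test; the writer is handed in rather
// than constructed inside the method.
class ComponentIndexer {
    private final DocumentWriter writer;

    ComponentIndexer(DocumentWriter writer) { this.writer = writer; }

    void index(String componentId, String componentDescription) {
        try {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("id", componentId);
            doc.put("contents", componentDescription);
            writer.addDocument(doc);
            writer.optimize();
            writer.close();
        } catch (IOException e) {
            // Stand-in for SearchService.Exception in the real code.
            throw new IllegalStateException("Exception updating index", e);
        }
    }
}
```

A test would build a MockDocumentWriter, pass it to ComponentIndexer, call
index(), and then inspect `calls` and `added`; setting `failOnAdd` exercises
the exception path without touching the file system.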

> Look at Lucene's current codebase.  I've added a couple of mock objects
> recently to test various things - maybe that could give you some ideas?

I am not trying to unit test Lucene.  I am trying to unit test an
application of Lucene.

What ya reckon?

Thanks,
Mike.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org


Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sunday, October 5, 2003, at 11:01  AM, Mike Hogan wrote:
> This is true, but I enter into the relationship with mocks with my eyes
> open.  If I was trying to mock-out some really complex API, with a lot of
> states etc, then your criticism would be justified.  But here I am trying to
> mock-out a few simple API calls, which may grow in the future, yes, but not
> hugely.  Personally I am happy with the risk in this case.  If only the
> classes I am trying to mock-out were non-final.....:)

I think your design is the one we need to call into question here.  If 
you have a few simple API calls (your own, that is, the search one you 
showed before), then your design should accommodate the mock situation 
you're grasping for here.  Again, how about a SearchManager interface, 
and then a LuceneSearchManager implementation, then a MockSearchManager
implementation that you can send into your higher level methods to see what
happens without actually calling Lucene at all?  Or perhaps a 
DirectoryFactory that is used to construct either a RAMDirectory (for 
testing) or FSDirectory (for production)?
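
As a rough sketch of the first suggestion (the names beyond SearchManager
and MockSearchManager are made up for illustration, and the
LuceneSearchManager implementation is left out so that the example needs
no Lucene classes), the seam could look something like this:

```java
import java.util.ArrayList;
import java.util.List;

// The application's own search abstraction; nothing here mentions Lucene.
interface SearchManager {
    void index(String componentId, String componentDescription);
}

// Test double: remembers what it was asked to index.
class MockSearchManager implements SearchManager {
    final List<String> indexedIds = new ArrayList<>();

    public void index(String componentId, String componentDescription) {
        indexedIds.add(componentId);
    }
}

// Hypothetical higher-level code that is the real subject of the test.
class ComponentRegistry {
    private final SearchManager search;

    ComponentRegistry(SearchManager search) { this.search = search; }

    void register(String id, String description) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("id required");
        }
        search.index(id, description);
    }
}
```

The higher-level test then checks that register() hands valid components to
the SearchManager and rejects invalid ones, while the Lucene-backed
implementation gets its own test against a real (in-memory) index.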

>> away with the bare minimum of functionality. I've yet to come across a
>> mock API that actually works as well as the real thing.
>
> That's the point - they are not supposed to work at all, never mind as well
> as the real thing.  But I know you know that.

I disagree here.  A mock *is* supposed to work.  It's the Real Deal, but 
with some reporting capabilities added on so you can see what happens 
to it as it goes through a black box.  We need to know precisely what 
we are testing here to really say whether a particular use of mocks is 
bad or good, but I'm sitting between Mike and Hani here.... mocks are 
great, but must be focused on one layer to test, not more.  Mocks can 
be abused, and this I feel is Mike's situation.  Mike.... refactor and 
redesign - don't try to change Lucene to make your testing easier - 
what you're after can be done without touching Lucene.  Besides, we 
won't let you touch Lucene that way anyway :))

	Erik




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Mike Hogan <me...@mikehogan.net>.
Hani,

> No matter how good your mock is, it won't be as good as a lucene
> provided and maintained implementation (RAMDirectory). Mocks in fact
> are dangerous for that very reason, they seek to emulate behaviour, and
> give you a false sense of security in the correctness of the mock.

This is true, but I enter into the relationship with mocks with my eyes
open.  If I was trying to mock-out some really complex API, with a lot of
states etc, then your criticism would be justified.  But here I am trying to
mock-out a few simple API calls, which may grow in the future, yes, but not
hugely.  Personally I am happy with the risk in this case.  If only the
classes I am trying to mock-out were non-final.....:)

> The same argument is extended to JDBC usage. I much prefer to have my
> unit tests call mckoi or some other java db that can be run inline with
> an in-memory store. it's a real world db, rather than an attempt to get
> away with the bare minimum of functionality. I've yet to come across a
> mock API that actually works as well as the real thing.

That's the point - they are not supposed to work at all, never mind as well
as the real thing.  But I know you know that.  I agree that there is a lot
more room for error and a false sense of security when mock-testing JDBC
code.  I have been burned by that myself.  Like so much in this great
science of ours, it comes down to tradeoffs.  In the case of JDBC, it's
complex enough to require Real Deal testing and mock-testing.  In this
simple case, I find mock-testing is sufficient.

Cheers,
Mike.




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Hani Suleiman <ha...@formicary.net>.
On Sunday, October 5, 2003, at 02:28 PM, Erik Hatcher wrote:

> I'm curious what your take of the two mocks I've recently added to 
> Lucene's codebase is - MockFilter and MockInputStream.  Both of these 
> are for sending in to somewhere else and probing what happens to them. 
>  How else could a Real Deal (tm) work in these scenarios?  And, I 
> would argue that these two are the real deals since they merely are 
> implementations (although certainly minimal and exploratory) of the 
> real interfaces of Lucene.
>
I looked at both of these and I think that in this case neither of them 
seems an 'abuse' of mocking. If anything, this is the exact usage of 
what a mock should be, whiteboxing a blackbox scenario, where you can 
interact with it, then poke about and verify the results of that 
interaction.

I don't know if you've used dynamic mocks or the big fat library of 
mocks in the mockobjects-java project, but THAT stuff to me is a pretty 
blatant abuse. Full (useless)  mocks of all of j2ee, where you're 
endlessly forced to define expectations and results for every call. 
Yuck!

Hani




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sunday, October 5, 2003, at 10:50  AM, Hani Suleiman wrote:
> To be honest (and yes, this is becoming a bit OT so hit delete if 
> you're so inclined)...

We'll indulge ourselves a bit more here before we let this thread go :)

> The same argument is extended to JDBC usage. I much prefer to have my 
> unit tests call mckoi or some other java db that can be run inline 
> with an in-memory store. it's a real world db, rather than an attempt 
> to get away with the bare minimum of functionality. I've yet to come 
> across a mock API that actually works as well as the real thing. To do 
> that you often have to fill in all sorts of bits by hand, thus 
> introducing the headache of yet more code to keep up to date and 
> maintain, when you could have just used the Real Deal (tm) in the 
> first place.

I think it really boils down to what you're trying to test.  We all 
need to know precisely what we're trying to test because we want to 
focus a test precisely on a small "unit".  With JDBC - if you're 
testing how your code interacts with the JDBC layer itself, it makes 
sense to use a mock to avoid testing the actual implementation of the 
driver as well and blurring what is being tested and what the results 
may mean.

I'm curious what your take of the two mocks I've recently added to 
Lucene's codebase is - MockFilter and MockInputStream.  Both of these 
are for sending in to somewhere else and probing what happens to them.  
How else could a Real Deal (tm) work in these scenarios?  And, I would 
argue that these two are the real deals since they merely are 
implementations (although certainly minimal and exploratory) of the 
real interfaces of Lucene.

> Hmmm, time to blog ;)

I agree!

	Erik




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Hani Suleiman <ha...@formicary.net>.
To be honest (and yes, this is becoming a bit OT so hit delete if  
you're so inclined)...

I think that this shows an obsession with mocks for the sake of mocks,  
rather than any real benefit.

No matter how good your mock is, it won't be as good as a lucene  
provided and maintained implementation (RAMDirectory). Mocks in fact  
are dangerous for that very reason, they seek to emulate behaviour, and  
give you a false sense of security in the correctness of the mock.

The same argument is extended to JDBC usage. I much prefer to have my  
unit tests call mckoi or some other java db that can be run inline with  
an in-memory store. it's a real world db, rather than an attempt to get  
away with the bare minimum of functionality. I've yet to come across a  
mock API that actually works as well as the real thing. To do that you  
often have to fill in all sorts of bits by hand, thus introducing the  
headache of yet more code to keep up to date and maintain, when you  
could have just used the Real Deal (tm) in the first place.

Hmmm, time to blog ;)

Hani

On Sunday, October 5, 2003, at 10:39 AM, Mike Hogan wrote:

> Erik,
>
>> Looks pretty much like you're testing Lucene here, not your application
>> around it.... nothing inside the try block is your own stuff, it's just
>> Lucene API calls.
>
> No, I am not testing Lucene.  I have written code that uses Lucene, yes, but
> I want to _divorce_ myself from Lucene when it comes to unit testing that
> code.  I just want to make sure I call the Lucene API correctly.  Not by
> actually calling it, but by calling mock equivalents and
> verify()'ing that I got the correct sequence of calls.  I do not want to
> have to worry about Lucene or the file system or a RAMDirectory.  This is
> what mock object testing is all about, as I understand it.  It's the same
> when unit testing code that does a bunch of JDBC calls.  I do not actually
> want to call the real JDBC driver, I want to call a mock version and
> verify() the parameters to my PreparedStatement and that my connections
> were closed etc etc.  Involving the JDBC driver for real is an integration
> test, not a unit test.
>
>>> What ya reckon?
>>
>> I reckon ya oughta have a look at Lucene's test cases and source code.
>> Ever hear of RAMDirectory?!  :))
>
> No, I have not until now :-)  I will take a look at the Lucene test cases,
> but if you are suggesting that I should slot in a file system replacement to
> test my application, you're asking me to take the harder of two roads.
> Doing this makes sense for those testing Lucene itself, but not for me (in
> the same way as using a database replacement makes sense for those testing
> JDBC drivers).  As I said, I am concerned only with verifying that I called
> the Lucene API correctly, not that Lucene does what it's supposed to do - I
> will test that separately.
>
> One final thing: what is the rationale behind using final at all on a class?
> I come from this line of thinking:
> http://lists.codehaus.org/pipermail/picocontainer-dev/2003-July/000743.html,
> so am interested to see what value you get from final.
>
> Cheers,
> Mike.




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Mike Hogan <me...@mikehogan.net>.
Erik,

> Again, a RAMDirectory would allow you to verify that you called the
> Lucene API correctly.  The code you showed, though, was locked into
> using a file system index making it inflexible and tougher to test.
> Your test case could construct a RAMDirectory with some known documents
> indexed, then call your search method and assert it got the right ones
> back.  Yes, this impacts your design - no question.  But this is a
> benefit of unit test driven development - you'll end up with a better
> design *because* you've refactored during testing.

A few posts back I pasted the refactored design I had when I tried to
mock-test my code.  You are right to say my code needs refactoring to unit
test.  I just rolled it back to the untestable state once I ran into the
public final IndexWriter problem.

> I would even go so far as to argue that RAMDirectory is a mock object
> (albeit a very sophisticated one) of sorts.

Yes it is.  But it's too sophisticated for what I need.  I can do the same
with a small test case and two small mocks.  My preference only.  Hence this
email thread.

> I think anyone arguing that final is bad because it makes unit testing
> more difficult is not quite seeing the big picture.  'final' is a good
> thing when used appropriately.  And it does not make testing any harder
> when using 'final' is the right decision for your design.  In the case of
> Lucene, protection access was very well thought out and it is by
> design.  It is always up for debate, but making something easier to
> test is not going to be a sufficient reason to change it.  Is
> there a real issue with Lucene being inflexible for extension in spots?
> History has proven it and things have been opened up in the past
> couple of releases of Lucene, but there were real use cases that demanded
> this, not just testing.

Well, this is a side issue and I probably should not have started it, but
I've started so I might as well continue :-).  What is the benefit of final
on a class?  Are you concerned that somebody will subclass an IndexWriter
and shag things up?  If so, shag _them_!

Cheers,
Mike.




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sunday, October 5, 2003, at 10:39  AM, Mike Hogan wrote:
> JDBC drivers).  As I said, I am concerned only with verifying that I called
> the Lucene API correctly, not that Lucene does what it's supposed to do - I
> will test that separately.

Again, a RAMDirectory would allow you to verify that you called the  
Lucene API correctly.  The code you showed, though, was locked into  
using a file system index making it inflexible and tougher to test.   
Your test case could construct a RAMDirectory with some known documents  
indexed, then call your search method and assert it got the right ones  
back.  Yes, this impacts your design - no question.  But this is a  
benefit of unit test driven development - you'll end up with a better  
design *because* you've refactored during testing.

I would even go so far as to argue that RAMDirectory is a mock object  
(albeit a very sophisticated one) of sorts.

> One final thing: what is the rationale behind using final at all on a class?
> I come from this line of thinking:
> http://lists.codehaus.org/pipermail/picocontainer-dev/2003-July/000743.html,
> so am interested to see what value you get from final.

I think anyone arguing that final is bad because it makes unit testing
more difficult is not quite seeing the big picture.  'final' is a good
thing when used appropriately.  And it does not make testing any harder
when using 'final' is the right decision for your design.  In the case of
Lucene, protection access was very well thought out and it is by
design.  It is always up for debate, but making something easier to
test is not going to be a sufficient reason to change it.  Is
there a real issue with Lucene being inflexible for extension in spots?
History has proven it and things have been opened up in the past
couple of releases of Lucene, but there were real use cases that demanded
this, not just testing.

	Erik




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Mike Hogan <me...@mikehogan.net>.
Erik,

> Ummm.... but.... you must be using Lucene 1.2 as IndexWriter was made
> non-final over a year ago.

Yes, I am using 1.2.   1.3 is only at RC1 as far as I could see, certainly
this indicates so: http://jakarta.apache.org/site/binindex.cgi.  What
version would you advise me to use?

> So, now that you ultimately have what you asked for, could you do us a
> favor and post back how you plan on using this, and how it will impact
> the design of your production code?  I really am curious how a
> MockIndexWriter will be written to really benefit what you're after.

I posted refactored code a few posts back.  But basically I delegated the
creation of Lucene classes to an interface, of which there are two impls:
the production one and the mock one.
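
For anyone who missed the earlier post, the shape is roughly as follows.
This is an illustrative sketch only, not the code I posted: java.io.Writer
stands in for IndexWriter so that it compiles without Lucene, and
WriterFactory, FileWriterFactory, InMemoryWriterFactory and
DescriptionStore are invented names.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical seam: the service asks this for its writer instead of
// calling `new` itself.
interface WriterFactory {
    Writer open() throws IOException;
}

// Production implementation: appends to a file on disk.
class FileWriterFactory implements WriterFactory {
    private final String path;

    FileWriterFactory(String path) { this.path = path; }

    public Writer open() throws IOException {
        return new FileWriter(path, true);
    }
}

// Test implementation: hands back an in-memory writer and counts how
// often it was asked for one.
class InMemoryWriterFactory implements WriterFactory {
    final StringWriter buffer = new StringWriter();
    int opened = 0;

    public Writer open() {
        opened++;
        return buffer;
    }
}

// Stand-in for the service; it never knows which implementation it has.
class DescriptionStore {
    private final WriterFactory factory;

    DescriptionStore(WriterFactory factory) { this.factory = factory; }

    void save(String id, String description) throws IOException {
        try (Writer w = factory.open()) {
            w.write(id + "=" + description + "\n");
        }
    }
}
```

The production wiring passes a FileWriterFactory; the unit test passes an
InMemoryWriterFactory and inspects `buffer` and `opened` afterwards.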

Cheers,
Mike.




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Ummm.... but.... you must be using Lucene 1.2 as IndexWriter was made 
non-final over a year ago.

So, now that you ultimately have what you asked for, could you do us a 
favor and post back how you plan on using this, and how it will impact 
the design of your production code?  I really am curious how a 
MockIndexWriter will be written to really benefit what you're after.

	Erik


---- from cvs log....
revision 1.7
date: 2002/07/17 17:38:04;  author: cutting;  state: Exp;  lines: +7 -7
Made many methods and classes non-final, per requests.  This includes
IndexWriter and IndexSearcher, among others.




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Mike Hogan <me...@mikehogan.net>.
Erik,

> Looks pretty much like you're testing Lucene here, not your application
> around it.... nothing inside the try block is your own stuff, it's just
> Lucene API calls.

No, I am not testing Lucene.  I have written code that uses Lucene, yes, but
I want to _divorce_ myself from Lucene when it comes to unit testing that
code.  I just want to make sure I call the Lucene API correctly.  Not by
actually calling it, but by calling mock equivalents and
verify()'ing that I got the correct sequence of calls.  I do not want to
have to worry about Lucene or the file system or a RAMDirectory.  This is
what mock object testing is all about, as I understand it.  It's the same
when unit testing code that does a bunch of JDBC calls.  I do not actually
want to call the real JDBC driver, I want to call a mock version and
verify() the parameters to my PreparedStatement and that my connections
were closed etc etc.  Involving the JDBC driver for real is an integration
test, not a unit test.

> > What ya reckon?
>
> I reckon ya oughta have a look at Lucene's test cases and source code.
> Ever hear of RAMDirectory?!  :))

No, I have not until now :-)  I will take a look at the Lucene test cases,
but if you are suggesting that I should slot in a file system replacement to
test my application, you're asking me to take the harder of two roads.
Doing this makes sense for those testing Lucene itself, but not for me (in
the same way as using a database replacement makes sense for those testing
JDBC drivers).  As I said, I am concerned only with verifying that I called
the Lucene API correctly, not that Lucene does what it's supposed to do - I
will test that separately.

One final thing: what is the rationale behind using final at all on a class?
I come from this line of thinking:
http://lists.codehaus.org/pipermail/picocontainer-dev/2003-July/000743.html,
so am interested to see what value you get from final.

Cheers,
Mike.




Re: [subscriptions] Re: Please make org.apache.lucene.index.IndexWriter non-final

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sunday, October 5, 2003, at 09:56  AM, Mike Hogan wrote:
> I am trying to test this:
>
> public void index(String componentId, String componentDescription)
>         throws SearchService.Exception {
>     IndexWriter writer = null;
>     try {
>         writer = new IndexWriter(INDEX_FILE_PATH, new StandardAnalyzer(),
>                 !indexExists());
>         final Document document = new Document();
>         document.add(Field.Text("id", componentId));
>         document.add(Field.Text("contents", componentDescription));
>         writer.addDocument(document);
>         writer.optimize();
>         writer.close();
>     } catch (IOException e) {
>         throw new SearchService.Exception("Exception updating Lucene index", e);
>     }
> }

Looks pretty much like you're testing Lucene here, not your application 
around it.... nothing inside the try block is your own stuff, it's just 
Lucene API calls.

>
> What ya reckon?

I reckon ya oughta have a look at Lucene's test cases and source code.  
Ever hear of RAMDirectory?!  :))

	Erik


