You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2010/11/30 07:45:58 UTC

API Semantics and Backwards

Hi

I'd like to discuss the semantics of our API and how backwards tests relate
to it. First, I'd like to confirm my understanding - currently it relates to
3x, but it will apply to 4x after 4.0 will be released:

Public/Protected -- this API is 'public' and we should maintain back-compat,
in the form of jar drop-in. That is, we cannot rename or modify it, w/o
deprecating first (I leave *exceptions* deliberately outside the
discussion).

Package-private -- this is not public API and while users can use it if they
declare their classes under the relevant package, they should not expect jar
drop-in support.

Public @internal -- this is public API following Java language, however not
public to the users. We need to make this API public so that Lucene can
access it, but it's used for internal purposes only. Users can still use it,
however cannot expect jar drop-in support.

Public @experimental -- this API is intended to be made 'public' one day,
however we're still working on it, and even though it's checked-in or even
released, it may change unexpectedly. Not sure we want to say that jar
drop-in support cannot be expected, though according to the definition we
are allowed to change it ... so perhaps it's like @internal, only w/ the
intention to make it public one day.

Both @internal and @experimental tags should be removed if they do not apply
anymore.

Now comes the question about backwards tests -- our tests touch all the API
types above, however they are not resilient to changes in 3 out of 4 of
them. In the past this wasn't a problem - the backwards layer had both
src/java and src/test, tests we compiled against src/java and then executed
against core.jar. This allowed changing the source code of the "non public"
API and make the same changes to backwards/src/java, and tests would still
run. This had a disadvantage too - it was 'easier' to break back-compat on
the first API type (the *true* public API) because you could still change
bw/src/java and be done w/ it.

Today though bw tests are compiled against the previous release source. But
if you make changes to the non public API, they break while they shouldn't.
So the question is what can we do about the backwards tests so that we can
still make allowed changes to the API w/o them breaking?

* We can say that unit tests should not test package-private / @internal /
@experimental classes, but I don't believe in it.

* We can re-introduce bw/src/java and ask all committers to make careful
changes to it. If we're careful, we won't introduce any *true* public API
break.

The second is the more realistic solution IMO, but since this was the
situation in the past and changed to how it is today, I don't know if it's
acceptable.

Whatever we do though, we cannot have backwards tests dictate what is public
API and what isn't, because bw tests are compiled following Java semantics,
that have nothing to do w/ Lucene's 'public' notion.

Shai

Re: API Semantics and Backwards

Posted by Earwin Burrfoot <ea...@gmail.com>.
Oh, Shai already said this, so +1.

On Tue, Nov 30, 2010 at 13:11, Earwin Burrfoot <ea...@gmail.com> wrote:
> We can try writing tests that only check binary compatibility for
> public/protected members?
> And use these for back-compat testing.
>
> On Tue, Nov 30, 2010 at 12:47, Shai Erera <se...@gmail.com> wrote:
>> I realize the benefits of not storing the backwards source -- I don't care
>> too much about the size of the checkout, but more about it making
>> introducing bw breaks easier.
>>
>> And that that it happened w/ MockRAMDir only so far, doesn't mean it won't
>> bite us somewhere else too. But it's not a good enough reason to create a
>> bw/src/java either.
>>
>> It would be great if we can remove all the unrelated tests from backwards.
>> As I see it, we should have two types of tests - those that check that
>> *public* API hasn't changed, and this can be in the form of reflection or
>> simply creating classes that call/extend the public API. Also, we want to
>> have tests that assert *runtime* backwards support, such as for Tokenizers,
>> QueryParser etc. For those we can have special test cases that assert
>> exactly that.
>>
>> Like you said, the rest of the tests just increase the test running time.
>>
>> Shai
>>
>> On Tue, Nov 30, 2010 at 9:50 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>>>
>>> Hi Shai,
>>>
>>>
>>>
>>> there are few comments:
>>>
>>>
>>>
>>> You are right with „we should not check backwards” for internal or
>>> *experimental* apis, the problem is that some tests use these internal APIs
>>> to check some internals. But as the idea behind backwards tests say: we test
>>> only drop in replacements. The functionality behind the tests is just
>>> nonsense, as the functionality is already tested by the main tests. So we
>>> just run the tests two times. Because of this, we can simply disable all
>>> tests that *explicitely* check internal APIs. An example would be a test
>>> that checks the exact class names of some returned objects (e.g. in
>>> MultiTermQuery rewrites).
>>>
>>>
>>>
>>> The problem are Mock classes in the backwards tests, that check internal
>>> apis (the famous example is MockRAMDirectory) as it is used by almost every
>>> test. If we would disable all these tests, we would not have the possibility
>>> to test any of them. For the current problem (and another one with this
>>> exact same class), we have a solution, I attached it to the issue. It’s a 10
>>> lines patch, it’s a hack, but its better than living with the cruft of
>>> having a modifiable backwards branch. If you really have to change these
>>> mock classes, you can do it like in the patch – but then you know you are
>>> doing something special and you can *mark* it as such (like I did in my
>>> patch).
>>>
>>>
>>>
>>> I am against reinserting the previous version’s classes for several
>>> reasons:
>>>
>>> -          The checkout gets big
>>>
>>> -          When we release a bugfix release of the previous version, we
>>> should be able to replace the old jar file by the bugfix one. I will sonn
>>> replace lucene-core-3.0.2.jar with 3.0.3.
>>>
>>> -          We should really don’t ever change the core test, because it
>>> contradicts the sense of backwards tests. If we really need to fix it (like
>>> for mock classes that are used by every test), it can be done in various
>>> ways: Remove the extra check code from the mock classes (this is often
>>> easiest) or use a reflection hack (we only have two of them now – but you
>>> changed the backwards branch much more often before I reverted it when
>>> adding the previous jar file).
>>>
>>> -          For tests that simply test some internal apis and nothing else
>>> important (for plugin compatibility): lets comment out the test. In the bw
>>> folder we did this with a special comment to leave the code intact.
>>>
>>>
>>>
>>> Uwe
>>>
>>>
>>>
>>> -----
>>>
>>> Uwe Schindler
>>>
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>>
>>> http://www.thetaphi.de
>>>
>>> eMail: uwe@thetaphi.de
>>>
>>>
>>>
>>> From: Shai Erera [mailto:serera@gmail.com]
>>> Sent: Tuesday, November 30, 2010 7:46 AM
>>> To: dev@lucene.apache.org
>>> Subject: API Semantics and Backwards
>>>
>>>
>>>
>>> Hi
>>>
>>> I'd like to discuss the semantics of our API and how backwards tests
>>> relate to it. First, I'd like to confirm my understanding - currently it
>>> relates to 3x, but it will apply to 4x after 4.0 will be released:
>>>
>>> Public/Protected -- this API is 'public' and we should maintain
>>> back-compat, in the form of jar drop-in. That is, we cannot rename or modify
>>> it, w/o deprecating first (I leave *exceptions* deliberately outside the
>>> discussion).
>>>
>>> Package-private -- this is not public API and while users can use it if
>>> they declare their classes under the relevant package, they should not
>>> expect jar drop-in support.
>>>
>>> Public @internal -- this is public API following Java language, however
>>> not public to the users. We need to make this API public so that Lucene can
>>> access it, but it's used for internal purposes only. Users can still use it,
>>> however cannot expect jar drop-in support.
>>>
>>> Public @experimental -- this API is intended to be made 'public' one day,
>>> however we're still working on it, and even though it's checked-in or even
>>> released, it may change unexpectedly. Not sure we want to say that jar
>>> drop-in support cannot be expected, though according to the definition we
>>> are allowed to change it ... so perhaps it's like @internal, only w/ the
>>> intention to make it public one day.
>>>
>>> Both @internal and @experimental tags should be removed if they do not
>>> apply anymore.
>>>
>>> Now comes the question about backwards tests -- our tests touch all the
>>> API types above, however they are not resilient to changes in 3 out of 4 of
>>> them. In the past this wasn't a problem - the backwards layer had both
>>> src/java and src/test, tests we compiled against src/java and then executed
>>> against core.jar. This allowed changing the source code of the "non public"
>>> API and make the same changes to backwards/src/java, and tests would still
>>> run. This had a disadvantage too - it was 'easier' to break back-compat on
>>> the first API type (the *true* public API) because you could still change
>>> bw/src/java and be done w/ it.
>>>
>>> Today though bw tests are compiled against the previous release source.
>>> But if you make changes to the non public API, they break while they
>>> shouldn't. So the question is what can we do about the backwards tests so
>>> that we can still make allowed changes to the API w/o them breaking?
>>>
>>> * We can say that unit tests should not test package-private / @internal /
>>> @experimental classes, but I don't believe in it.
>>>
>>> * We can re-introduce bw/src/java and ask all committers to make careful
>>> changes to it. If we're careful, we won't introduce any *true* public API
>>> break.
>>>
>>> The second is the more realistic solution IMO, but since this was the
>>> situation in the past and changed to how it is today, I don't know if it's
>>> acceptable.
>>>
>>> Whatever we do though, we cannot have backwards tests dictate what is
>>> public API and what isn't, because bw tests are compiled following Java
>>> semantics, that have nothing to do w/ Lucene's 'public' notion.
>>>
>>> Shai
>>
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Phone: +7 (495) 683-567-4
> ICQ: 104465785
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: API Semantics and Backwards

Posted by Earwin Burrfoot <ea...@gmail.com>.
We can try writing tests that only check binary compatibility for
public/protected members?
And use these for back-compat testing.

On Tue, Nov 30, 2010 at 12:47, Shai Erera <se...@gmail.com> wrote:
> I realize the benefits of not storing the backwards source -- I don't care
> too much about the size of the checkout, but more about it making
> introducing bw breaks easier.
>
> And that that it happened w/ MockRAMDir only so far, doesn't mean it won't
> bite us somewhere else too. But it's not a good enough reason to create a
> bw/src/java either.
>
> It would be great if we can remove all the unrelated tests from backwards.
> As I see it, we should have two types of tests - those that check that
> *public* API hasn't changed, and this can be in the form of reflection or
> simply creating classes that call/extend the public API. Also, we want to
> have tests that assert *runtime* backwards support, such as for Tokenizers,
> QueryParser etc. For those we can have special test cases that assert
> exactly that.
>
> Like you said, the rest of the tests just increase the test running time.
>
> Shai
>
> On Tue, Nov 30, 2010 at 9:50 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>>
>> Hi Shai,
>>
>>
>>
>> there are few comments:
>>
>>
>>
>> You are right with „we should not check backwards” for internal or
>> *experimental* apis, the problem is that some tests use these internal APIs
>> to check some internals. But as the idea behind backwards tests say: we test
>> only drop in replacements. The functionality behind the tests is just
>> nonsense, as the functionality is already tested by the main tests. So we
>> just run the tests two times. Because of this, we can simply disable all
>> tests that *explicitely* check internal APIs. An example would be a test
>> that checks the exact class names of some returned objects (e.g. in
>> MultiTermQuery rewrites).
>>
>>
>>
>> The problem are Mock classes in the backwards tests, that check internal
>> apis (the famous example is MockRAMDirectory) as it is used by almost every
>> test. If we would disable all these tests, we would not have the possibility
>> to test any of them. For the current problem (and another one with this
>> exact same class), we have a solution, I attached it to the issue. It’s a 10
>> lines patch, it’s a hack, but its better than living with the cruft of
>> having a modifiable backwards branch. If you really have to change these
>> mock classes, you can do it like in the patch – but then you know you are
>> doing something special and you can *mark* it as such (like I did in my
>> patch).
>>
>>
>>
>> I am against reinserting the previous version’s classes for several
>> reasons:
>>
>> -          The checkout gets big
>>
>> -          When we release a bugfix release of the previous version, we
>> should be able to replace the old jar file by the bugfix one. I will sonn
>> replace lucene-core-3.0.2.jar with 3.0.3.
>>
>> -          We should really don’t ever change the core test, because it
>> contradicts the sense of backwards tests. If we really need to fix it (like
>> for mock classes that are used by every test), it can be done in various
>> ways: Remove the extra check code from the mock classes (this is often
>> easiest) or use a reflection hack (we only have two of them now – but you
>> changed the backwards branch much more often before I reverted it when
>> adding the previous jar file).
>>
>> -          For tests that simply test some internal apis and nothing else
>> important (for plugin compatibility): lets comment out the test. In the bw
>> folder we did this with a special comment to leave the code intact.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>
>> http://www.thetaphi.de
>>
>> eMail: uwe@thetaphi.de
>>
>>
>>
>> From: Shai Erera [mailto:serera@gmail.com]
>> Sent: Tuesday, November 30, 2010 7:46 AM
>> To: dev@lucene.apache.org
>> Subject: API Semantics and Backwards
>>
>>
>>
>> Hi
>>
>> I'd like to discuss the semantics of our API and how backwards tests
>> relate to it. First, I'd like to confirm my understanding - currently it
>> relates to 3x, but it will apply to 4x after 4.0 will be released:
>>
>> Public/Protected -- this API is 'public' and we should maintain
>> back-compat, in the form of jar drop-in. That is, we cannot rename or modify
>> it, w/o deprecating first (I leave *exceptions* deliberately outside the
>> discussion).
>>
>> Package-private -- this is not public API and while users can use it if
>> they declare their classes under the relevant package, they should not
>> expect jar drop-in support.
>>
>> Public @internal -- this is public API following Java language, however
>> not public to the users. We need to make this API public so that Lucene can
>> access it, but it's used for internal purposes only. Users can still use it,
>> however cannot expect jar drop-in support.
>>
>> Public @experimental -- this API is intended to be made 'public' one day,
>> however we're still working on it, and even though it's checked-in or even
>> released, it may change unexpectedly. Not sure we want to say that jar
>> drop-in support cannot be expected, though according to the definition we
>> are allowed to change it ... so perhaps it's like @internal, only w/ the
>> intention to make it public one day.
>>
>> Both @internal and @experimental tags should be removed if they do not
>> apply anymore.
>>
>> Now comes the question about backwards tests -- our tests touch all the
>> API types above, however they are not resilient to changes in 3 out of 4 of
>> them. In the past this wasn't a problem - the backwards layer had both
>> src/java and src/test, tests we compiled against src/java and then executed
>> against core.jar. This allowed changing the source code of the "non public"
>> API and make the same changes to backwards/src/java, and tests would still
>> run. This had a disadvantage too - it was 'easier' to break back-compat on
>> the first API type (the *true* public API) because you could still change
>> bw/src/java and be done w/ it.
>>
>> Today though bw tests are compiled against the previous release source.
>> But if you make changes to the non public API, they break while they
>> shouldn't. So the question is what can we do about the backwards tests so
>> that we can still make allowed changes to the API w/o them breaking?
>>
>> * We can say that unit tests should not test package-private / @internal /
>> @experimental classes, but I don't believe in it.
>>
>> * We can re-introduce bw/src/java and ask all committers to make careful
>> changes to it. If we're careful, we won't introduce any *true* public API
>> break.
>>
>> The second is the more realistic solution IMO, but since this was the
>> situation in the past and changed to how it is today, I don't know if it's
>> acceptable.
>>
>> Whatever we do though, we cannot have backwards tests dictate what is
>> public API and what isn't, because bw tests are compiled following Java
>> semantics, that have nothing to do w/ Lucene's 'public' notion.
>>
>> Shai
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: API Semantics and Backwards

Posted by Shai Erera <se...@gmail.com>.
I've hit another backwards tests problem.

Over in LUCENE-2790 we've changed LogMergePolicy.useCompoundFile's behavior
to factor in the newly added noCFSRatio. After some discussion we decided
that even though it breaks back-compat's runtime behavior, it's ok in this
case because how Lucene manages the internal representation of segments
(compound or not) is up to it. And you can override it by disabling the
CFSRatio setting.

And indeed some tests failed (backwards as well as core stream) and the way
to fix them was to force CFS creation. However, on backwards this is not
doable because the tests are compiled against 3.0's source, where
setNoCFSRatio does not exist on LogMergePolicy, even though we agree that
this change is allowed back-compat wise.

I ended up fixing it by querying for the method using reflection and the
tests now pass.

Now, regardless of this change (whether it's ok or not), I think this shows
another problem with how we maintain backwards tests. Internal changes like
this, especially for @experimental / @internal classes are allowed, but we
need to revert to reflection hacks to resolve the tests.

So either we delete the offending tests, because like Uwe says - they
duplicate the test efforts, or we maintain a source for backwards.

I personally am in favor of removing all "non backwards" tests, and keep
those that do actually test backwards behavior. But I know the opinions are
divided here.

Shai

On Wed, Dec 1, 2010 at 4:48 PM, Shai Erera <se...@gmail.com> wrote:

> While I'm not against going back towards a checkout backwards that we can
> modify, I wonder if all the tests there should be there and how much do we
> actually duplicate.
>
> Lucene 3x should include all of 3.0 tests + new ones that test new
> functionality, or assert bug fixes etc. There shouldn't be a test in 3.0
> that does not exist in 3x, unless the missing test/feature was an agreed
> upon backwards break.
>
> So I think it would be really nice if backwards tested exactly what it
> should. For example, testing index format backcompat is done twice today, in
> "test-core" and "test-backwards", while it should only be run by backwards.
> There are a bunch of test classes I've created once that impl/extend
> 'search' related classes, for back-compat compilation only. They should also
> be run in backwards only.
>
> The downside of this is that maintenance is going to be difficult - it's
> much easier to copy tests over to backwards, then decide which ones should
> go there and which shouldn't. Also, adding new API requires a matching
> backwards test etc. Not non doable, but difficult - requires discipline.
>
> Shai
>
>
> On Tue, Nov 30, 2010 at 2:02 PM, Robert Muir <rc...@gmail.com> wrote:
>
>> On Tue, Nov 30, 2010 at 4:47 AM, Shai Erera <se...@gmail.com> wrote:
>> >
>> > Like you said, the rest of the tests just increase the test running
>> time.
>> >
>>
>> I'm not completely sure about this: do we always switch over our tests
>> to do the equivalent checks both against the new API and the old API
>> when we make API changes? There could be bugs in our 'backwards
>> handling' that are actually logic bugs that the new tests dont detect.
>>
>> So I'm a little concerned about only running "pure" simplistic API
>> tests in backwards.
>>
>> On the other hand, I'm really worried about what Shai brings up here:
>> we are doing some refactoring of the tests system and there is more
>> shared code at the moment: similar to MockRAMDirectory.
>> Because we worry about preventing things like index corruption, its my
>> opinion we need things like MockRAMDirectory, and they should be able
>> to break all the rules/etc (use pkg-private APIS) if we can prevent
>> bugs.
>> Just look at our trunk or 3.x tests and imagine them as backwards
>> tests... these sort of "utilities" like RandomIndexWriter will be more
>> fragile to internal/experimental/pkg-private changes, but as mentioned
>> above i think these are good to have in backwards tests.
>>
>> So, I think at the moment I'm leaning towards the idea of going back
>> towards a checkout that we can modify, in combination with us all
>> soliciting more reviews / longer time for any backports to stable
>> branches that require backwards tests modifications?  I understand
>> Uwe's point too, its dangerous to modify the code and seems to defeat
>> the purpose of backwards, but I think this is going to be a more
>> serious problem after releasing 3.1!
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>

Re: API Semantics and Backwards

Posted by Shai Erera <se...@gmail.com>.
While I'm not against going back towards a checkout backwards that we can
modify, I wonder if all the tests there should be there and how much do we
actually duplicate.

Lucene 3x should include all of 3.0 tests + new ones that test new
functionality, or assert bug fixes etc. There shouldn't be a test in 3.0
that does not exist in 3x, unless the missing test/feature was an agreed
upon backwards break.

So I think it would be really nice if backwards tested exactly what it
should. For example, testing index format backcompat is done twice today, in
"test-core" and "test-backwards", while it should only be run by backwards.
There are a bunch of test classes I've created once that impl/extend
'search' related classes, for back-compat compilation only. They should also
be run in backwards only.

The downside of this is that maintenance is going to be difficult - it's
much easier to copy tests over to backwards, then decide which ones should
go there and which shouldn't. Also, adding new API requires a matching
backwards test etc. Not non doable, but difficult - requires discipline.

Shai

On Tue, Nov 30, 2010 at 2:02 PM, Robert Muir <rc...@gmail.com> wrote:

> On Tue, Nov 30, 2010 at 4:47 AM, Shai Erera <se...@gmail.com> wrote:
> >
> > Like you said, the rest of the tests just increase the test running time.
> >
>
> I'm not completely sure about this: do we always switch over our tests
> to do the equivalent checks both against the new API and the old API
> when we make API changes? There could be bugs in our 'backwards
> handling' that are actually logic bugs that the new tests dont detect.
>
> So I'm a little concerned about only running "pure" simplistic API
> tests in backwards.
>
> On the other hand, I'm really worried about what Shai brings up here:
> we are doing some refactoring of the tests system and there is more
> shared code at the moment: similar to MockRAMDirectory.
> Because we worry about preventing things like index corruption, its my
> opinion we need things like MockRAMDirectory, and they should be able
> to break all the rules/etc (use pkg-private APIS) if we can prevent
> bugs.
> Just look at our trunk or 3.x tests and imagine them as backwards
> tests... these sort of "utilities" like RandomIndexWriter will be more
> fragile to internal/experimental/pkg-private changes, but as mentioned
> above i think these are good to have in backwards tests.
>
> So, I think at the moment I'm leaning towards the idea of going back
> towards a checkout that we can modify, in combination with us all
> soliciting more reviews / longer time for any backports to stable
> branches that require backwards tests modifications?  I understand
> Uwe's point too, its dangerous to modify the code and seems to defeat
> the purpose of backwards, but I think this is going to be a more
> serious problem after releasing 3.1!
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

Re: API Semantics and Backwards

Posted by Robert Muir <rc...@gmail.com>.
On Tue, Nov 30, 2010 at 4:47 AM, Shai Erera <se...@gmail.com> wrote:
>
> Like you said, the rest of the tests just increase the test running time.
>

I'm not completely sure about this: do we always switch over our tests
to do the equivalent checks both against the new API and the old API
when we make API changes? There could be bugs in our 'backwards
handling' that are actually logic bugs that the new tests dont detect.

So I'm a little concerned about only running "pure" simplistic API
tests in backwards.

On the other hand, I'm really worried about what Shai brings up here:
we are doing some refactoring of the tests system and there is more
shared code at the moment: similar to MockRAMDirectory.
Because we worry about preventing things like index corruption, its my
opinion we need things like MockRAMDirectory, and they should be able
to break all the rules/etc (use pkg-private APIS) if we can prevent
bugs.
Just look at our trunk or 3.x tests and imagine them as backwards
tests... these sort of "utilities" like RandomIndexWriter will be more
fragile to internal/experimental/pkg-private changes, but as mentioned
above i think these are good to have in backwards tests.

So, I think at the moment I'm leaning towards the idea of going back
towards a checkout that we can modify, in combination with us all
soliciting more reviews / longer time for any backports to stable
branches that require backwards tests modifications?  I understand
Uwe's point too, its dangerous to modify the code and seems to defeat
the purpose of backwards, but I think this is going to be a more
serious problem after releasing 3.1!

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: API Semantics and Backwards

Posted by Shai Erera <se...@gmail.com>.
I realize the benefits of not storing the backwards source -- I don't care
too much about the size of the checkout, but more about it making
introducing bw breaks easier.

And that that it happened w/ MockRAMDir only so far, doesn't mean it won't
bite us somewhere else too. But it's not a good enough reason to create a
bw/src/java either.

It would be great if we can remove all the unrelated tests from backwards.
As I see it, we should have two types of tests - those that check that
*public* API hasn't changed, and this can be in the form of reflection or
simply creating classes that call/extend the public API. Also, we want to
have tests that assert *runtime* backwards support, such as for Tokenizers,
QueryParser etc. For those we can have special test cases that assert
exactly that.

Like you said, the rest of the tests just increase the test running time.

Shai

On Tue, Nov 30, 2010 at 9:50 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi Shai,
>
>
>
> there are few comments:
>
>
>
> You are right with „we should not check backwards” for internal or **
> experimental** apis, the problem is that some tests use these internal
> APIs to check some internals. But as the idea behind backwards tests say: we
> test only drop in replacements. The functionality behind the tests is just
> nonsense, as the functionality is already tested by the main tests. So we
> just run the tests two times. Because of this, we can simply disable all
> tests that **explicitely** check internal APIs. An example would be a test
> that checks the exact class names of some returned objects (e.g. in
> MultiTermQuery rewrites).
>
>
>
> The problem are Mock classes in the backwards tests, that check internal
> apis (the famous example is MockRAMDirectory) as it is used by almost every
> test. If we would disable all these tests, we would not have the possibility
> to test any of them. For the current problem (and another one with this
> exact same class), we have a solution, I attached it to the issue. It’s a 10
> lines patch, it’s a hack, but its better than living with the cruft of
> having a modifiable backwards branch. If you really have to change these
> mock classes, you can do it like in the patch – but then you know you are
> doing something special and you can **mark** it as such (like I did in my
> patch).
>
>
>
> I am against reinserting the previous version’s classes for several
> reasons:
>
> -          The checkout gets big
>
> -          When we release a bugfix release of the previous version, we
> should be able to replace the old jar file by the bugfix one. I will sonn
> replace lucene-core-3.0.2.jar with 3.0.3.
>
> -          We should really don’t ever change the core test, because it
> contradicts the sense of backwards tests. If we really need to fix it (like
> for mock classes that are used by every test), it can be done in various
> ways: Remove the extra check code from the mock classes (this is often
> easiest) or use a reflection hack (we only have two of them now – but you
> changed the backwards branch much more often before I reverted it when
> adding the previous jar file).
>
> -          For tests that simply test some internal apis and nothing else
> important (for plugin compatibility): lets comment out the test. In the bw
> folder we did this with a special comment to leave the code intact.
>
>
>
> Uwe
>
>
>
> -----
>
> Uwe Schindler
>
> H.-H.-Meier-Allee 63, D-28213 Bremen
>
> http://www.thetaphi.de
>
> eMail: uwe@thetaphi.de
>
>
>
> *From:* Shai Erera [mailto:serera@gmail.com]
> *Sent:* Tuesday, November 30, 2010 7:46 AM
> *To:* dev@lucene.apache.org
> *Subject:* API Semantics and Backwards
>
>
>
> Hi
>
> I'd like to discuss the semantics of our API and how backwards tests relate
> to it. First, I'd like to confirm my understanding - currently it relates to
> 3x, but it will apply to 4x after 4.0 will be released:
>
> Public/Protected -- this API is 'public' and we should maintain
> back-compat, in the form of jar drop-in. That is, we cannot rename or modify
> it, w/o deprecating first (I leave *exceptions* deliberately outside the
> discussion).
>
> Package-private -- this is not public API and while users can use it if
> they declare their classes under the relevant package, they should not
> expect jar drop-in support.
>
> Public @internal -- this is public API following Java language, however not
> public to the users. We need to make this API public so that Lucene can
> access it, but it's used for internal purposes only. Users can still use it,
> however cannot expect jar drop-in support.
>
> Public @experimental -- this API is intended to be made 'public' one day,
> however we're still working on it, and even though it's checked-in or even
> released, it may change unexpectedly. Not sure we want to say that jar
> drop-in support cannot be expected, though according to the definition we
> are allowed to change it ... so perhaps it's like @internal, only w/ the
> intention to make it public one day.
>
> Both @internal and @experimental tags should be removed if they do not
> apply anymore.
>
> Now comes the question about backwards tests -- our tests touch all the API
> types above, however they are not resilient to changes in 3 out of 4 of
> them. In the past this wasn't a problem - the backwards layer had both
> src/java and src/test, tests we compiled against src/java and then executed
> against core.jar. This allowed changing the source code of the "non public"
> API and make the same changes to backwards/src/java, and tests would still
> run. This had a disadvantage too - it was 'easier' to break back-compat on
> the first API type (the *true* public API) because you could still change
> bw/src/java and be done w/ it.
>
> Today though bw tests are compiled against the previous release source. But
> if you make changes to the non public API, they break while they shouldn't.
> So the question is what can we do about the backwards tests so that we can
> still make allowed changes to the API w/o them breaking?
>
> * We can say that unit tests should not test package-private / @internal /
> @experimental classes, but I don't believe in it.
>
> * We can re-introduce bw/src/java and ask all committers to make careful
> changes to it. If we're careful, we won't introduce any *true* public API
> break.
>
> The second is the more realistic solution IMO, but since this was the
> situation in the past and changed to how it is today, I don't know if it's
> acceptable.
>
> Whatever we do though, we cannot have backwards tests dictate what is
> public API and what isn't, because bw tests are compiled following Java
> semantics, that have nothing to do w/ Lucene's 'public' notion.
>
> Shai
>

RE: API Semantics and Backwards

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Shai,

 

there are few comments:

 

You are right with "we should not check backwards" for internal or
*experimental* apis, the problem is that some tests use these internal APIs
to check some internals. But as the idea behind backwards tests say: we test
only drop in replacements. The functionality behind the tests is just
nonsense, as the functionality is already tested by the main tests. So we
just run the tests two times. Because of this, we can simply disable all
tests that *explicitely* check internal APIs. An example would be a test
that checks the exact class names of some returned objects (e.g. in
MultiTermQuery rewrites).

 

The problem are Mock classes in the backwards tests, that check internal
apis (the famous example is MockRAMDirectory) as it is used by almost every
test. If we would disable all these tests, we would not have the possibility
to test any of them. For the current problem (and another one with this
exact same class), we have a solution, I attached it to the issue. It's a 10
lines patch, it's a hack, but its better than living with the cruft of
having a modifiable backwards branch. If you really have to change these
mock classes, you can do it like in the patch - but then you know you are
doing something special and you can *mark* it as such (like I did in my
patch).

 

I am against reinserting the previous version's classes for several reasons:

-          The checkout gets big

-          When we release a bugfix release of the previous version, we
should be able to replace the old jar file by the bugfix one. I will sonn
replace lucene-core-3.0.2.jar with 3.0.3.

-          We should really don't ever change the core test, because it
contradicts the sense of backwards tests. If we really need to fix it (like
for mock classes that are used by every test), it can be done in various
ways: Remove the extra check code from the mock classes (this is often
easiest) or use a reflection hack (we only have two of them now - but you
changed the backwards branch much more often before I reverted it when
adding the previous jar file).

-          For tests that simply test some internal apis and nothing else
important (for plugin compatibility): lets comment out the test. In the bw
folder we did this with a special comment to leave the code intact.

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Shai Erera [mailto:serera@gmail.com] 
Sent: Tuesday, November 30, 2010 7:46 AM
To: dev@lucene.apache.org
Subject: API Semantics and Backwards

 

Hi

I'd like to discuss the semantics of our API and how backwards tests relate
to it. First, I'd like to confirm my understanding - currently it relates to
3x, but it will apply to 4x after 4.0 will be released:

Public/Protected -- this API is 'public' and we should maintain back-compat,
in the form of jar drop-in. That is, we cannot rename or modify it, w/o
deprecating first (I leave *exceptions* deliberately outside the
discussion).

Package-private -- this is not public API and while users can use it if they
declare their classes under the relevant package, they should not expect jar
drop-in support.

Public @internal -- this is public API following Java language, however not
public to the users. We need to make this API public so that Lucene can
access it, but it's used for internal purposes only. Users can still use it,
however cannot expect jar drop-in support.

Public @experimental -- this API is intended to be made 'public' one day,
however we're still working on it, and even though it's checked-in or even
released, it may change unexpectedly. Not sure we want to say that jar
drop-in support cannot be expected, though according to the definition we
are allowed to change it ... so perhaps it's like @internal, only w/ the
intention to make it public one day.

Both @internal and @experimental tags should be removed if they do not apply
anymore.

Now comes the question about backwards tests -- our tests touch all the API
types above, however they are not resilient to changes in 3 out of 4 of
them. In the past this wasn't a problem - the backwards layer had both
src/java and src/test, tests we compiled against src/java and then executed
against core.jar. This allowed changing the source code of the "non public"
API and make the same changes to backwards/src/java, and tests would still
run. This had a disadvantage too - it was 'easier' to break back-compat on
the first API type (the *true* public API) because you could still change
bw/src/java and be done w/ it.

Today though bw tests are compiled against the previous release source. But
if you make changes to the non public API, they break while they shouldn't.
So the question is what can we do about the backwards tests so that we can
still make allowed changes to the API w/o them breaking? 

* We can say that unit tests should not test package-private / @internal /
@experimental classes, but I don't believe in it.

* We can re-introduce bw/src/java and ask all committers to make careful
changes to it. If we're careful, we won't introduce any *true* public API
break.

The second is the more realistic solution IMO, but since this was the
situation in the past and changed to how it is today, I don't know if it's
acceptable.

Whatever we do though, we cannot have backwards tests dictate what is public
API and what isn't, because bw tests are compiled following Java semantics,
that have nothing to do w/ Lucene's 'public' notion.

Shai