Posted to derby-dev@db.apache.org by Myrna van Lunteren <m....@gmail.com> on 2005/10/06 20:00:59 UTC

proposed modification current test harness (Re: DERBY-575)

Hi,
 As a result of Dan's comments in reference to DERBY-575,
http://mail-archives.apache.org/mod_mbox/db-derby-dev/200509.mbox/%3c339246018.1127141188732.JavaMail.jira@ajax.apache.org%3e
I am proposing the following modification to the current test harness:
 - any read of .properties files and the like will be in the original encoding;
 effectively, this means adding 'ISO-8859-1' in a number of places where an
InputStreamReader is used in the test harness classes.
- the .out file will be copied into the local encoding and this copy will be
used in the diff.
 I suggest giving this copied master file the extension .tmpmstr - which is what
happens with networkserver tests.
 Code may need to be added to remove the copied master if the test passes.
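
A minimal sketch of the copy step proposed above, assuming the master is available as a stream (for example from the testing jar) and that masters are checked in as ISO-8859-1. The class and method names (MasterCopy, copyToLocalEncoding) are hypothetical and are not part of the existing harness; only the .tmpmstr extension comes from the proposal.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class MasterCopy {
    // Hypothetical helper: decode the checked-in master as ISO-8859-1 and
    // re-encode it in the given local charset as <testName>.tmpmstr.
    static Path copyToLocalEncoding(InputStream master, Path outDir,
                                    String testName, Charset local) throws IOException {
        Path copy = outDir.resolve(testName + ".tmpmstr");
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(master, StandardCharsets.ISO_8859_1));
             BufferedWriter out = new BufferedWriter(
                 new OutputStreamWriter(Files.newOutputStream(copy), local))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine(); // platform line separator, as the test output would use
            }
        }
        return copy;
    }
}
```

In the harness, the local charset would presumably be Charset.defaultCharset(), and the copy would be deleted once the diff comes back clean.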
 This approach has the following benefits:
 - on a non-ASCII system like zOS the tests can still be run without
requiring all text files to be converted first, but the generated output
and .diff can still be looked at and compared with expected output by a
human
 - the expected output is right there to compare with for a human
investigating a failure, even when you're running with jars. Note that
networkserver already copies the expected output.
 - I am also wondering if this might get around harness bug DERBY-244.
 I'd like to find out if anyone is opposed to this idea.
I know we'd like to get JUnit tests going, but I think this is still a
worthwhile change.
 Thx,
Myrna

Re: proposed modification current test harness (Re: DERBY-575)

Posted by Rajesh Kartha <ka...@Source-Zone.Org>.
Kathey Marsden wrote:

>Daniel John Debrunner wrote:
>[snip conversation about how to
>
>>the counter is with your solution how do they generate a
>>master file suitable for submitting as part of a contribution? I'm not
>>sure there is a good solution.
>
>I see nothing wrong with leaving test development on OS390 as another
>fish to fry,  and  I think in general Myrna's plan is a good one if I
>understand it correctly.   Generally we agree on an encoding for master
>files, sql files etc and always check in files in that encoding.  Then
>the corresponding files under the test output directory are always in
>the native encoding making tests easy to run and failures easy to
>diagnose. 
>
>What is a little fuzzy to me is the agreed input encoding for the
>sql and master files, especially for languages like Japanese that are
>not going to conform to ISO 8859-1.  What would seem to make sense to me
>would be ISO 8859-1 with escape sequences like the common property file
>format in Java
>(http://java.sun.com/j2se/1.4.2/docs/api/java/util/Properties.html#load(java.io.InputStream)).
>That way you could run native2ascii on whatever platform you were
>working with if you had special  characters in the text to get the
>desired master file.  Does specifying an input encoding  of ISO 8859-1 
>read in the escaped  sequences properly, or do you need to specify
>something else? 
>
>Kathey
>
>
>  
>
I was thinking, for test development on platforms like OS/390 that have a
different encoding, wouldn't it help if the user added the master encoding
information as one of the test properties (for example, a property like
derby.test.master.encoding)? He/she should then be able to submit the
master (in its original encoding) without any need for conversion. The
harness can get this property and use it to read the master and, if needed,
convert it to 'UTF-8' to be readable on regular platforms, or skip the test.
I would think the same will work for files in Korean, Japanese etc.

Something like:
http://java.sun.com/docs/books/tutorial/i18n/text/stream.html
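
A rough sketch of how the harness might honor such a property, under stated assumptions: derby.test.master.encoding is only the name suggested above and does not exist in the harness today, the UTF-8 fallback is an assumed default, and the MasterEncoding class is hypothetical. A null result stands for "this JVM cannot decode the master, skip the test".

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.Properties;

final class MasterEncoding {
    // Proposed property name from this thread; not an existing harness property.
    static final String PROP = "derby.test.master.encoding";
    // Assumed fallback when the test declares nothing.
    static final String DEFAULT = "UTF-8";

    // Returns the declared charset, the fallback when none is declared,
    // or null when this JVM does not support the declared name.
    static Charset resolve(Properties testProps) {
        String name = testProps.getProperty(PROP, DEFAULT);
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : null;
        } catch (IllegalArgumentException badName) {
            return null; // syntactically illegal charset name
        }
    }

    // Opens the master with an explicit charset instead of the platform default.
    static BufferedReader openMaster(InputStream master, Charset cs) {
        return new BufferedReader(new InputStreamReader(master, cs));
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty(PROP, "ISO-8859-1");
        Charset cs = resolve(props);
        if (cs == null) {
            System.out.println("skipping test: unsupported master encoding");
            return;
        }
        byte[] master = "ij> select 1;".getBytes(cs);
        try (BufferedReader r = openMaster(new ByteArrayInputStream(master), cs)) {
            System.out.println(r.readLine());
        }
    }
}
```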

For the reasons pointed out by Myrna, I agree that the harness needs to
convert the master (from the testing jar) to the local encoding for
users to understand the failures. I would also like to suggest that, instead
of writing the converted files to the local disk, the master be converted to
the local encoding using streams and kept in memory. The following snippet
reads the master as 'UTF-8'.

For example, on zOS:

    import java.io.*;

    // read the default master file from the Derby testing jar
    InputStream aStream2 = new FileInputStream(infile2);

    // UTF-8 being used as default encoding for reading the master file
    Reader aReader2 = new InputStreamReader(aStream2, "UTF-8");
    BufferedReader br2 = new BufferedReader(aReader2);

    // convert to local encoding while writing to the output stream
    // (PrintWriter on an OutputStream encodes with the platform default)
    ByteArrayOutputStream bao = new ByteArrayOutputStream();
    PrintWriter eWriter = new PrintWriter(bao);
    String s = null;
    while ((s = br2.readLine()) != null) {
        eWriter.println(s);
    }
    eWriter.close();
    br2.close();

    // get the new converted input stream, decoded with the platform default
    ByteArrayInputStream bio = new ByteArrayInputStream(bao.toByteArray());
    aReader2 = new InputStreamReader(bio);
    br2 = new BufferedReader(aReader2);

    // do comparison between the newly converted master and the actual output

This will avoid writing the converted files to disk, and hence
the need for cleanup as originally suggested.
Moreover, this conversion needs to happen only if
System.getProperty("file.encoding") is of a special type, say
Cp1047 (zOS) etc.
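
One way to sketch that check, assuming the harness knows the master encoding by name: compare resolved Charset objects rather than raw strings, so that aliases such as "UTF8" and "UTF-8" are not treated as different. The ConversionCheck class and needsConversion method are hypothetical names.

```java
import java.nio.charset.Charset;

final class ConversionCheck {
    // True when the master must be re-encoded before diffing against
    // output written in the local encoding. Both names must be charsets
    // this JVM supports, otherwise Charset.forName throws.
    static boolean needsConversion(String masterEncoding, String localEncoding) {
        return !Charset.forName(masterEncoding).equals(Charset.forName(localEncoding));
    }

    public static void main(String[] args) {
        // Charset.defaultCharset() reflects the encoding this JVM uses by default.
        String local = Charset.defaultCharset().name();
        System.out.println(local + " needs conversion from UTF-8: "
                + needsConversion("UTF-8", local));
    }
}
```

This covers more than a fixed list such as Cp1047, since any mismatch between the two charsets triggers the conversion.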

My 2 cents,

-Rajesh







Re: proposed modification current test harness (Re: DERBY-575)

Posted by Kathey Marsden <km...@sbcglobal.net>.
Daniel John Debrunner wrote:
[snip conversation about how to

>  
>
>the counter is with your solution how do they generate a
>master file suitable for submitting as part of a contribution? I'm not
>sure there is a good solution.
>
>  
>
I see nothing wrong with leaving test development on OS390 as another
fish to fry,  and  I think in general Myrna's plan is a good one if I
understand it correctly.   Generally we agree on an encoding for master
files, sql files etc and always check in files in that encoding.  Then
the corresponding files under the test output directory are always in
the native encoding making tests easy to run and failures easy to
diagnose. 

What is a little fuzzy to me is the agreed input encoding for the
sql and master files, especially for languages like Japanese that are
not going to conform to ISO 8859-1.  What would seem to make sense to me
would be ISO 8859-1 with escape sequences like the common property file
format in Java
(http://java.sun.com/j2se/1.4.2/docs/api/java/util/Properties.html#load(java.io.InputStream)).
That way you could run native2ascii on whatever platform you were
working with if you had special  characters in the text to get the
desired master file.  Does specifying an input encoding  of ISO 8859-1 
read in the escaped  sequences properly, or do you need to specify
something else? 
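
On that last question, a small sketch may help. Going by the java.util.Properties documentation, load(InputStream) both assumes ISO 8859-1 and decodes the escape sequences, whereas an InputStreamReader opened with ISO-8859-1 only maps bytes to characters and leaves the escapes as literal text. So .sql and master files read through a plain Reader would presumably need a separate unescape step. The EscapeDemo class and the sample key are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class EscapeDemo {
    // Pure-ASCII bytes, as native2ascii would emit them: the value is a
    // six-character escape sequence for HIRAGANA LETTER A (U+3042).
    static final byte[] ESCAPED =
            "greeting=\\u3042".getBytes(StandardCharsets.ISO_8859_1);

    // Properties.load decodes the escape into a single character.
    static String viaProperties() throws IOException {
        Properties p = new Properties();
        p.load(new ByteArrayInputStream(ESCAPED));
        return p.getProperty("greeting");
    }

    // A reader with an explicit charset returns the line untouched.
    static String viaReader() throws IOException {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(ESCAPED), StandardCharsets.ISO_8859_1))) {
            return r.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(viaProperties().length()); // 1: escape was decoded
        System.out.println(viaReader().length());     // 15: escape left as text
    }
}
```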

Kathey



Re: proposed modification current test harness (Re: DERBY-575)

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Myrna van Lunteren wrote:
> On 10/6/05, Daniel John Debrunner <djd@debrunners.com> wrote:
> 
>     Myrna van Lunteren wrote:
> 
>     > Hi,
>     >
>     > As a result of Dan's comments in reference to DERBY-575,
>     > http://mail-archives.apache.org/mod_mbox/db-derby-dev/200509.mbox/%3c339246018.1127141188732.JavaMail.jira@ajax.apache.org%3e
>     > I am proposing the following modification to the current test harness:
>     >
>     > - any read of .properties files and the like will be in original
>     encoding
>     >     effectively, this means to add 'ISO-8859-1' in a number of places
>     > where a InputStreamReader is used in the test harness classes.
>     > - the .out file will be copied into the local encoding and this
>     will be
>     > used in the diff.
>     >     I suggest giving this copied master file extension .tmpmstr -
>     which
>     > is what happens with networkserver tests.
>     >     Code may need to be added to remove the copied master if the test
>     > passes.
>     >
>     > This approach has the following benefit:
>     >     - on a non-ASCII system like zOS the tests can still be run
>     without
>     > requiring all text files to be converted first, but the generated
>     > output and .diff can still be looked at and compared with expected
>     > output by a human
>     >     - the expected output is right there to compare with for a human
>     > investigating a failure, even when you're running with jars. Note that
>     > networkserver already copies the expected output.
>     >     - I am also wondering if this might get around harness bug
>     DERBY-244.
> 
> 
>     There is the problem of test development on such a platform, either for
>     new tests (and test cases) or for platform specific masters. Your scheme
>     will result in the master output being in an encoding that cannot be
>     checked into the codeline, or copied into the master directory, since
>     it is not in ISO-8859-1.
> 
>     Maybe it can be worked around by the developer switching between using
>     ISO-8859-1 and the default on that platform.
> 
>     The other risk is that a canon will be checked into the master
>     directory that is not in ISO-8859-1 encoding, since someone is running
>     tests on such a platform.
> 
>     Dan.
> 
>  
> Thx for your input...
>  
> Maybe I wasn't clear on this or quite possibly I'm missing
> something...But let's take what I've been doing as an example. We start
> with the jars only. Assuming the properties files get read in ASCII (so
> the harness knows which tests to run for a suite, currently that fails),
> the harness then produces test output. The current output of the tests
> is readable on zOS. The masters however are in ASCII format, since
> they're part of a jar file. To diff those two, would mean we'd have to
> convert one or the other.
>  
> If we'd *not* make the masters files in an encoding that is readable to
> the person, then we'd need to write the output out in ASCII also, which
> is unreadable to a person. So, how could someone developing a test on
> such a system ever decide that the output is correct?

Not sure; the counter is that, with your solution, how do they generate a
master file suitable for submitting as part of a contribution? I'm not
sure there is a good solution.

> Also, if we'd really gain a person developing on zOS, I'm assuming
> they'd also have svn on zOs. Then, wouldn't svn's eol-style native take
> care of the conversion?

No, svn eol-style only changes the end-of-line handling, not the content
of the file. Also, this is not a problem specific to z/OS; it's any
platform where the default encoding does not match the master file. More
specifically, where the encoding is much the same as the expected one but
has differences.
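
To illustrate the "much the same, but has differences" case, here is a small sketch using ISO-8859-1 and UTF-8 as stand-ins (the thread's actual concern is platform defaults such as Cp1047, which not every JVM ships). ASCII-only text encodes to identical bytes in both, so a mismatch goes unnoticed until a non-ASCII character shows up. The EncodingMismatch class is a made-up name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodingMismatch {
    // True when the text has the same byte representation in both encodings.
    static boolean sameBytes(String text) {
        return Arrays.equals(text.getBytes(StandardCharsets.ISO_8859_1),
                             text.getBytes(StandardCharsets.UTF_8));
    }

    // What a reader sees when UTF-8 bytes are decoded as ISO-8859-1.
    static String misread(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8),
                          StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(sameBytes("ij> select 1;"));    // true: ASCII only
        System.out.println(sameBytes("caf\u00e9"));        // false: e-acute differs
        System.out.println(misread("caf\u00e9").length()); // 5: one char became two
    }
}
```

Changing line endings, which is all eol-style does, would not touch any of these bytes.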

Dan.


Re: proposed modification current test harness (Re: DERBY-575)

Posted by Myrna van Lunteren <m....@gmail.com>.
On 10/6/05, Daniel John Debrunner <dj...@debrunners.com> wrote:

> Myrna van Lunteren wrote:
>
> > Hi,
> >
> > As a result of Dan's comments in reference to DERBY-575,
> > http://mail-archives.apache.org/mod_mbox/db-derby-dev/200509.mbox/%3c339246018.1127141188732.JavaMail.jira@ajax.apache.org%3e
> >
> > I am proposing the following modification to the current test harness:
> >
> > - any read of .properties files and the like will be in original
> encoding
> > effectively, this means to add 'ISO-8859-1' in a number of places
> > where a InputStreamReader is used in the test harness classes.
> > - the .out file will be copied into the local encoding and this will be
> > used in the diff.
> > I suggest giving this copied master file extension .tmpmstr - which
> > is what happens with networkserver tests.
> > Code may need to be added to remove the copied master if the test
> > passes.
> >
> > This approach has the following benefit:
> > - on a non-ASCII system like zOS the tests can still be run without
> > requiring all text files to be converted first, but the generated
> > output and .diff can still be looked at and compared with expected
> > output by a human
> > - the expected output is right there to compare with for a human
> > investigating a failure, even when you're running with jars. Note that
> > networkserver already copies the expected output.
> > - I am also wondering if this might get around harness bug DERBY-244.
>
>
> There is the problem of test development on such a platform, either for
> new tests (and test cases) or for platform specific masters. Your scheme
> will result in the master output being in an encoding that cannot be
> checked into the codeline, or copied into the master directory, since
> it is not in ISO-8859-1.
>
> Maybe it can be worked around by the developer switching between using
> ISO-8859-1 and the default on that platform.
>
> The other risk is that a canon will be checked into the master
> directory that is not in ISO-8859-1 encoding, since someone is running tests
> on such a platform.
>
> Dan.

 Thx for your input...
 Maybe I wasn't clear on this or quite possibly I'm missing
something...But let's take what I've been doing as an example. We start with
the jars only. Assuming the properties files get read in ASCII (so the
harness knows which tests to run for a suite, currently that fails), the
harness then produces test output. The current output of the tests is
readable on zOS. The masters however are in ASCII format, since they're part
of a jar file. To diff those two, would mean we'd have to convert one or the
other.
 If we'd *not* make the masters files in an encoding that is readable to the
person, then we'd need to write the output out in ASCII also, which is
unreadable to a person. So, how could someone developing a test on such a
system ever decide that the output is correct?
  Also, if we'd really gain a person developing on zOS, I'm assuming they'd
also have svn on zOS. Then, wouldn't svn's eol-style native take care of the
conversion?
 Note that currently the test harness doesn't handle/differentiate
OS specifics at all - we only have JVM-specific masters. Doing further
OS-platform specifics is worm-farming we've managed to stay out of - so far.
Most of the OS-platform-specific differences have been fairly minor (slightly
different error messages coming from the JVMs) and are usually sed-ed out by
test-specific sed files. DERBY-244, which I mentioned, is the only issue that
I've been fearing would need OS-specific canon handling.
 Myrna

Re: proposed modification current test harness (Re: DERBY-575)

Posted by Daniel John Debrunner <dj...@debrunners.com>.
Myrna van Lunteren wrote:

> Hi,
>  
> As a result of Dan's comments in reference to DERBY-575,
> http://mail-archives.apache.org/mod_mbox/db-derby-dev/200509.mbox/%3c339246018.1127141188732.JavaMail.jira@ajax.apache.org%3e
> I am proposing the following modification to the current test harness:
>  
> - any read of .properties files and the like will be in original encoding
>     effectively, this means to add 'ISO-8859-1' in a number of places
> where a InputStreamReader is used in the test harness classes.
> - the .out file will be copied into the local encoding and this will be
> used in the diff.
>     I suggest giving this copied master file extension .tmpmstr - which
> is what happens with networkserver tests.
>     Code may need to be added to remove the copied master if the test
> passes.
>  
> This approach has the following benefit:
>     - on a non-ASCII system like zOS the tests can still be run without
> requiring all text files to be converted first, but the generated
> output and .diff can still be looked at and compared with expected
> output by a human
>     - the expected output is right there to compare with for a human
> investigating a failure, even when you're running with jars. Note that
> networkserver already copies the expected output.
>     - I am also wondering if this might get around harness bug DERBY-244.


There is the problem of test development on such a platform, either for
new tests (and test cases) or for platform specific masters. Your scheme
will result in the master output being in an encoding that cannot be
checked into the codeline, or copied into the master directory, since
it is not in ISO-8859-1.

Maybe it can be worked around by the developer switching between using
ISO-8859-1 and the default on that platform.

The other risk is that a canon will be checked into the master
directory that is not in ISO-8859-1 encoding, since someone is running tests
on such a platform.

Dan.