Posted to derby-dev@db.apache.org by Andrew McIntyre <mc...@gmail.com> on 2006/08/10 12:04:04 UTC

test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Failures in these tests have been arriving ever since these test
results started being sent to this list. Synapses connected with
regard to a recent problem I had running tests on a specific machine.
So, I'm wondering...

On 8/10/06, Ole.Solberg@sun.com <Ol...@sun.com> wrote:
> [Auto-generated mail]
>
> </snip lots of different results on different machines>

I recently had some problems running some tests on a machine due to
overaggressive firewall software on that machine. Considering that
these mails involve tests running on different operating systems
presumably running on different machines, and that a lot of them
involve access to network resources, is there a possibility that the
differences in these test runs are due to various firewall or network
access controls on these machines?

e.g., in the JDK 1.4 runs above, the Linux tests pass completely,
while there are two failures in the Cygwin environment:

a) derbynetmats/NSinSameJVM.diff, failure is "FAIL Network Server did not start"

Did the server really not start, or is it just not reachable because
overaggressive firewall software on the machine is preventing access?

b) derbynetmats/DerbyNet/multi/stress.diff

No clear idea on this one, maybe failure to communicate to the client
threads that they should shut down?

For the 1.4 SunOS tests in the same run, there are 9 errors in
derbynetmats. Do we need to grant permissions in our policy file for
db2jcc.jar, e.g. java.net.SocketPermission read,write? Or are the ports
used by the tests being blocked at the OS level? This error makes me
especially suspicious:

>   Could not access database through the network server.
>     java.net.ConnectException : Error opening socket to server xxxFILTERED_HOSTNAMExxx on port 31415 with message : Connection refused

refused why?
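
One way to narrow down a "Connection refused" is to separate an active refusal from a silent drop. The sketch below is a hypothetical helper (the class name PortProbe and the 2-second timeout are assumptions, not part of the Derby test harness) that makes a single TCP connect and classifies the result. A refusal usually means nothing is listening on the port, though a firewall that rejects rather than drops can look the same; a timeout is more typical of packets being filtered.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical diagnostic helper; not part of the Derby test harness. */
final class PortProbe {

    enum Result { CONNECTED, REFUSED, TIMED_OUT, OTHER_ERROR }

    /** Attempts one TCP connect and classifies the outcome. */
    static Result probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return Result.CONNECTED;
        } catch (ConnectException e) {
            // The connect was actively rejected: usually nothing is listening,
            // but a rejecting (not dropping) filter can produce the same error.
            return Result.REFUSED;
        } catch (SocketTimeoutException e) {
            // No answer within the timeout: typical of silently dropped packets.
            return Result.TIMED_OUT;
        } catch (IOException e) {
            // Unknown host, no route to host, and similar failures.
            return Result.OTHER_ERROR;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 31415;
        System.out.println(host + ":" + port + " -> " + probe(host, port, 2000));
    }
}
```

Running it against the host and port from the failing test while the server is supposedly up would show which of the cases applies on that machine.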

Now, the 1.5 tests:

Cygwin:
NSinSameJVM, same as above.
dcl/encryption_key/encryptDatabaseTest3, overaggressive firewall
software blocking URLs with subsubprotocols, maybe? The problem in
each of these cases is with URLs starting with 'jdbc:derby:jar' or
'jdbc:derby:classpath'.
upgradeTest, I think the property that points to the older Derby jars
is not being passed in properly.

SunOS 10-x86:
derbynetmats/ShutDownDBWhenNSShutsDownTest.junit, communication blocked
derbynetmats/sysinfo_withproperties.java, test policy problem?
permissions not granted to db2jcc.jar in test policy?
derbynetmats/testconnection.java, communication blocked
derbynetclientmats/parameterMapping.java, communication blocked
derbynetclientmats/xaSimplePositive.java, not sure about this one, but
it looks like it happens at the first real read on the client side,
maybe communication blocked on the client end?
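
On the db2jcc.jar policy question: as far as I know, the actions that java.net.SocketPermission accepts are accept, connect, listen and resolve, so a grant for a client jar would more likely need connect,resolve than read,write. The small check below (the class name and the localhost:1527 target are hypothetical) just asks the JDK whether it accepts a given action list; it does not touch the test policy file itself.

```java
import java.net.SocketPermission;

/** Hypothetical check of which action strings SocketPermission accepts. */
final class SocketPermissionActions {

    /** Returns true if the JDK accepts the action list for the given target. */
    static boolean isValidActionList(String target, String actions) {
        try {
            new SocketPermission(target, actions);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown by the constructor when an action name is not recognised.
            return false;
        }
    }

    public static void main(String[] args) {
        String target = "localhost:1527"; // assumed host:port, for illustration only
        System.out.println("connect,resolve accepted: "
                + isValidActionList(target, "connect,resolve"));
        System.out.println("read,write accepted: "
                + isValidActionList(target, "read,write"));
    }
}
```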

SunOS 9 / 11:
procedureInTrigger looks like a recent trigger problem, not sure why
it only shows up here.

It would be great if we could figure out why these tests fail only in
these specific environments, despite efforts to reproduce the failures
in similar environments elsewhere.

HTH,
andrew

Re: test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Posted by Ole Solberg <Ol...@Sun.COM>.
Kathey Marsden wrote:
> Ole Solberg wrote:
> 
>>>
>>> I recently had some problems running some tests on a machine due to
>>> overaggressive firewall software on that machine. Considering that
>>> these mails involve tests running on different operating systems
>>> presumably running on different machines, and that a lot of them
>>> involve access to network resources, is there a possibility that the
>>> differences in these test runs are due to various firewall or network
>>> access controls on these machines?
>>
>>
>>
>> Thanks Andrew, for the hint!
>> The 'CYGWIN_NT-5.1_i686-unknown' machine (on my home network) runs 
>> firewall sw. I'll try turning that off.
>>
After disabling the Symantec Client Firewall Version 5.1, the test passes!


> Just FYI,  I think there is something  not quite right with network 
> server's interaction with firewall software or maybe , where one can get 
> a hang even though the software is set to allow connections.   I have 
> not been able to reproduce it reliably.  If you can it would be worth 
> investigating.  I know Bryan expressed interest in this issue.
> 
> The history is that with  the following firewall software:
> 
> Check Point Integrity Flex version 6.01.182.000
> TrueVector security engine version 6.0.182.000
> Driver version 6.0.182.0000
> 
> On a Windows XP machine, NSInSameJVM sometimes hangs and does not pop up
> a window requesting access, but I cannot reproduce this reliably.
> Other times, the window pops up so I can allow access, but the test
> still does not proceed.
> 
> I had a user report on Windows 2003 that a simple network server/client 
> program would hang with the same software.
> The  ultimate resolution of this issue was that Integrity Flex was not 
> supported on Windows 2003.    But, before we got to that point we saw 
> that just a simple attempt to make a client connection  through ij would 
> hang.  Using a small java program to just simulate the client/server 
> socket interactions, I could not reproduce.
> 
> My gut feeling was that perhaps something in network server was not 
> getting flushed or something else is just not quite right with the 
> socket interaction in network server/client, but again never got close 
> enough to a reproducible case to really track it down.
> 
> Another clue with regard to NSInSameJVM is  DERBY-589.  This is reported 
> not to reproduce with jdk 1.5, but again seems to illustrate a 
> fragility  of network server or NSINSameJVM that can cause a hang.
> 
> Anyway, short story, if you can reproduce this reliably, before you 
> disable your firewall software take a few minutes to document how to 
> reproduce this elusive problem and file a bug and  point to this thread. 
> Kathey
> 

I updated https://issues.apache.org/jira/browse/DERBY-952 .


-- 
Ole Solberg, Database Technology Group,
Sun Microsystems, Trondheim, Norway

Re: test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Posted by Daniel John Debrunner <dj...@apache.org>.
Kathey Marsden wrote:

> Just FYI,  I think there is something  not quite right with network
> server's interaction with firewall software or maybe , where one can get
> a hang even though the software is set to allow connections.

FYI - I'm seeing hangs in some of the tests when the wait for the server
to boot (done by pinging it) fails. The test then hangs while trying to
clean up; the hang is not in the network server. I see this with
testSecMec and testProperties. I can get these tests to pass by bumping
the wait time. Of course it's interesting that each test seems to have
its own method to perform the wait and its own time to wait. I'm not
sure I should be bumping the time for these tests, but I will commit the
cleanup for the tests. Maybe the increase in time is just due to my
machine, or it points to a real problem with the server.
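
Since each test apparently rolls its own wait, a shared helper might look roughly like the following. This is only a sketch under assumptions: the class and method names are made up, and it polls with a plain TCP connect to stay self-contained, whereas the real tests would presumably use NetworkServerControl.ping(). The point is a single deadline-based loop with a configurable wait, which returns false on timeout so the caller can clean up and fail rather than hang.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical shared wait helper; not the actual Derby test code. */
final class ServerWait {

    /**
     * Polls until a TCP connect to host:port succeeds or maxWaitMillis elapses.
     * Returns true if the port became reachable, false on timeout.
     */
    static boolean waitForPort(String host, int port, long maxWaitMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        int connectTimeout = (int) Math.max(1L, pollMillis); // 0 would mean "wait forever"
        while (true) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), connectTimeout);
                return true;
            } catch (IOException notUpYet) {
                // Server not accepting connections yet; retry until the deadline.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults for illustration: Derby's usual port and a 30 s budget.
        boolean up = waitForPort("localhost", 1527, 30_000, 500);
        System.out.println(up ? "server reachable" : "gave up waiting");
    }
}
```

With one helper like this, bumping the wait for a slow machine becomes a single parameter change instead of a per-test edit.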

Dan.



Re: test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Posted by Daniel John Debrunner <dj...@apache.org>.
Daniel John Debrunner wrote:

> Kathey Marsden wrote:
> 
> 
>>Kathey Marsden wrote:
>>
>>
>>>Just FYI,  I think there is something  not quite right with network
>>>server's interaction with firewall software or maybe , where one can
>>>get a hang even though the software is set to allow connections. 
>>
>>
>>Should have said:
>>
>>Just FYI,  I think there is something  not quite right with network
>>server's interaction with firewall software or maybe the tests , where
>>one can get a hang even though the software is set to allow connections.
>>Also DERBY-1278 looks like another similar hang with testSecMec on Z/OS.
> 
> 
> FYI - I'm seeing a hang running derbyNet/testProperties test. I tried
> without the firewall software and still see it.

So this hang is a test problem, not one with the network server.
testProperties.execCmd() forks a JVM but does not handle its streams.
This will cause problems, as indicated by the javadoc for Process:

"The parent process uses these streams to feed input to and get output
from the subprocess. Because some native platforms only provide limited
buffer size for standard input and output streams, failure to promptly
write the input stream or read the output stream of the subprocess may
cause the subprocess to block, and even deadlock."

Making a hack change to always copy the streams to System.out made
the hang disappear, though of course it introduced more output. I'm not
sure how stable it is, though.
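
For reference, a less hacky version of that change might look like this: drain stdout and stderr of the forked process on background threads, so the child can never block on a full pipe. This is a sketch, not the actual testProperties.execCmd() code; the class and method names are hypothetical, and it uses lambdas, so it assumes a Java 8 or later runtime.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

/** Hypothetical helper that forks a process and always drains its output. */
final class DrainedProcess {

    /** Starts a daemon thread that copies everything from in to out until EOF. */
    static Thread drain(InputStream in, OutputStream out) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[4096];
            try {
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                out.flush();
            } catch (IOException e) {
                // Stream closed or child went away; nothing more to drain.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /**
     * Runs the command, draining stdout and stderr concurrently into sink
     * (output from the two streams may interleave). Returns the exit code.
     */
    static int run(List<String> command, OutputStream sink)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).start();
        p.getOutputStream().close(); // child sees EOF on its stdin
        Thread out = drain(p.getInputStream(), sink);
        Thread err = drain(p.getErrorStream(), sink);
        int exit = p.waitFor();
        out.join();
        err.join();
        return exit;
    }
}
```

Passing a ByteArrayOutputStream instead of System.out as the sink would keep the extra output out of the test's own output while still preventing the deadlock.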

Dan.


Re: test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Posted by Daniel John Debrunner <dj...@apache.org>.
Kathey Marsden wrote:

> Kathey Marsden wrote:
> 
>> Just FYI,  I think there is something  not quite right with network
>> server's interaction with firewall software or maybe , where one can
>> get a hang even though the software is set to allow connections. 
> 
> 
> Should have said:
> 
> Just FYI,  I think there is something  not quite right with network
> server's interaction with firewall software or maybe the tests , where
> one can get a hang even though the software is set to allow connections.
> Also DERBY-1278 looks like another similar hang with testSecMec on Z/OS.

FYI - I'm seeing a hang running derbyNet/testProperties test. I tried
without the firewall software and still see it.

Dan.


Re: test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Posted by Kathey Marsden <km...@sbcglobal.net>.
Kathey Marsden wrote:

> Just FYI,  I think there is something  not quite right with network 
> server's interaction with firewall software or maybe , where one can 
> get a hang even though the software is set to allow connections. 

Should have said:

Just FYI,  I think there is something  not quite right with network 
server's interaction with firewall software or maybe the tests , where 
one can get a hang even though the software is set to allow connections. 

Also DERBY-1278 looks like another similar hang with testSecMec on Z/OS.



Re: test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Posted by Kathey Marsden <km...@sbcglobal.net>.
Ole Solberg wrote:

>>
>> I recently had some problems running some tests on a machine due to
>> overaggressive firewall software on that machine. Considering that
>> these mails involve tests running on different operating systems
>> presumably running on different machines, and that a lot of them
>> involve access to network resources, is there a possibility that the
>> differences in these test runs are due to various firewall or network
>> access controls on these machines?
>
>
> Thanks Andrew, for the hint!
> The 'CYGWIN_NT-5.1_i686-unknown' machine (on my home network) runs 
> firewall sw. I'll try turning that off.
>
Just FYI,  I think there is something  not quite right with network 
server's interaction with firewall software or maybe , where one can get 
a hang even though the software is set to allow connections.   I have 
not been able to reproduce it reliably.  If you can it would be worth 
investigating.  I know Bryan expressed interest in this issue.

The history is that with  the following firewall software:

Check Point Integrity Flex version 6.01.182.000
TrueVector security engine version 6.0.182.000
Driver version 6.0.182.0000

On a Windows XP machine, NSInSameJVM sometimes hangs and does not pop up
a window requesting access, but I cannot reproduce this reliably.
Other times, the window pops up so I can allow access, but the test
still does not proceed.

I had a user report on Windows 2003 that a simple network server/client 
program would hang with the same software.
The  ultimate resolution of this issue was that Integrity Flex was not 
supported on Windows 2003.    But, before we got to that point we saw 
that just a simple attempt to make a client connection  through ij would 
hang.  Using a small java program to just simulate the client/server 
socket interactions, I could not reproduce.

My gut feeling was that perhaps something in network server was not 
getting flushed or something else is just not quite right with the 
socket interaction in network server/client, but again never got close 
enough to a reproducible case to really track it down.

Another clue with regard to NSInSameJVM is  DERBY-589.  This is reported 
not to reproduce with jdk 1.5, but again seems to illustrate a 
fragility  of network server or NSINSameJVM that can cause a hang.

Anyway, short story, if you can reproduce this reliably, before you 
disable your firewall software take a few minutes to document how to 
reproduce this elusive problem and file a bug and  point to this thread.  

Kathey

Re: test failures caused by firewall software? (was Re: Regression Test Failure! - Derby 430109 - Sun DBTG)

Posted by Ole Solberg <Ol...@Sun.COM>.
Andrew McIntyre wrote:
> Failures in these tests have been arriving ever since these test
> results started being sent to this list. Synapses connected with
> regard to a recent problem I had running tests on a specific machine.
> So, I'm wondering...
> 
> On 8/10/06, Ole.Solberg@sun.com <Ol...@sun.com> wrote:
> 
>> [Auto-generated mail]
>>
>> </snip lots of different results on different machines>
> 
> 
> I recently had some problems running some tests on a machine due to
> overaggressive firewall software on that machine. Considering that
> these mails involve tests running on different operating systems
> presumably running on different machines, and that a lot of them
> involve access to network resources, is there a possibility that the
> differences in these test runs are due to various firewall or network
> access controls on these machines?

Thanks Andrew, for the hint!
The 'CYGWIN_NT-5.1_i686-unknown' machine (on my home network) runs 
firewall sw. I'll try turning that off.


> 
> e.g., in the JDK 1.4 runs above, the Linux tests pass completely,
> while there are two failures in the Cygwin environment:
> 
> a) derbynetmats/NSinSameJVM.diff, failure is "FAIL Network Server did 
> not start"
> 
> or, is it not reachable because of overaggressive firewall software on
> the machine preventing access?
> 
> b) derbynetmats/DerbyNet/multi/stress.diff
> 
> No clear idea on this one, maybe failure to communicate to the client
> threads that they should shut down?
> 
> For the 1.4 SunOS tests in the same run, 9 errors in derbynetmats, do
> we need to grant permissions in our policy file for db2jcc.jar? e.g.
> java.net.SocketPermission read,write? Or are the ports used by the
> tests being blocked at the OS level? This error makes me especially
> suspicious:
> 
>>   Could not access database through the network server.
>>     java.net.ConnectException : Error opening socket to server 
>> xxxFILTERED_HOSTNAMExxx on port 31415 with message : Connection refused
> 
> refused why?
> 
> Now, the 1.5 tests:
> 
> Cygwin:
> NSinSameJVM, same as above.
('CYGWIN_NT-5.1_i686-unknown')

> dcl/encryption_key/encryptDatabaseTest3, overaggressive firewall
> software blocking URLs with subsubprotocols, maybe? The problem in
> each of these cases is with URLs starting with 'jdbc:derby:jar' or
> 'jdbc:derby:classpath'.
> upgradeTest, I think the property that points to the older Derby jars
> is not being passed in properly.
These are on the 'CYGWIN_NT-5.2_i686-unknown' (on our lab network) - no 
firewall sw.
The upgrade test should be fixed on next run....

> 
> SunOS 10-x86:
> derbynetmats/ShutDownDBWhenNSShutsDownTest.junit, communication blocked
> derbynetmats/sysinfo_withproperties.java, test policy problem?
> permissions not granted to db2jcc.jar in test policy?
> derbynetmats/testconnection.java, communication blocked
> derbynetclientmats/parameterMapping.java, communication blocked
> derbynetclientmats/xaSimplePositive.java, not sure about this one, but
> it looks like it happens at the first real read on the client side,
> maybe communication blocked on the client end?
Also on our lab network, as are the other SunOS 5.9/5.10 and Linux-2.6.9 
machines.

> 
> SunOS 9 / 11:
> procedureInTrigger looks like a recent trigger problem, not sure why
> it only shows up here.
(SunOS-5.11 on my home network)


For test machines on our lab network I build on SunOS 5.10 x86.
For machines on my home network, svn update, build and test are done on
each machine.

> 
> It would be great if we could figure out why we constantly get these
> reminders of why these tests only fail in these specific environments,
> despite efforts to reproduce them in similar environments elsewhere.
> 
> HTH,
> andrew


-- 
Ole Solberg, Database Technology Group,
Sun Microsystems, Trondheim, Norway