Posted to dev@river.apache.org by Peter Firmstone <ji...@zeus.net.au> on 2011/12/10 04:28:09 UTC

Implications for Security Checks - SocketPermission, URL and DNS lookups

DNS lookups and reverse lookups triggered by the equals, hashCode and
implies methods of URL and SocketPermission create some serious
performance problems for distributed programs.

The concurrent policy implementation I've been working on reduces lock 
contention between threads performing security checks.

When the SecurityManager is used to check a guard, it calls the 
AccessController, which retrieves the AccessControlContext from the call 
stack. This contains all the ProtectionDomains on the call stack (I 
won't go into privileged calls here). If a ProtectionDomain is dynamic, 
it consults the Policy before checking the static permissions it 
contains.

The problem with the old policy implementation is lock contention: 
multiple threads all use multiple ProtectionDomains, and the time taken 
to perform a check is considerable, especially where identical security 
checks are performed by multiple threads executing the same code.

Although the concurrent policy reduces contention between 
ProtectionDomains calling Policy.implies, there remain some fundamental 
problems with the implementations of SocketPermission and URL that cause 
unnecessary DNS lookups during the equals(), hashCode() and implies() 
methods.

The following bugs concern SocketPermission (please read before 
continuing):

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a 
lot of valuable comments.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed, 
perhaps incorrectly.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746

Anyway, to cut a long story short, DNS lookups and DNS reverse lookups 
are performed by the equals and hashCode implementations in 
SocketPermission and URL, with disastrous performance implications for 
policy implementations that use collections and cache the results of 
security permission checks.
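
A minimal sketch of the alternative, not code from the policy
implementation: java.net.URI compares and hashes on its parsed string
components only, so it can be used as a collection key without touching
the network. The class name UriKeyedCache and the example locations are
made up for illustration.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache keyed by URI; equals() and hashCode() never touch DNS. */
final class UriKeyedCache {
    private final Map<URI, Boolean> results = new ConcurrentHashMap<>();

    /** Records the outcome of a check for the given location string. */
    void put(String location, boolean allowed) {
        results.put(URI.create(location), allowed);
    }

    /** Returns the cached result, or null if the location has not been checked. */
    Boolean get(String location) {
        return results.get(URI.create(location));
    }
}
```

With this keying, "http://localhost/a.jar" and "http://127.0.0.1/a.jar"
are simply two different keys, whatever DNS says about them.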

For example, once a SocketPermission guard has been checked for a 
specific AccessControlContext, the result is cached by my 
SecurityManager, avoiding repeat security checks. However, if that cache 
contains a SocketPermission, DNS lookups will be required, and the cache 
will perform slower than some other directly performed security checks! 
The cache is intended to return quickly to avoid reconsulting every 
ProtectionDomain on the stack.

To make matters worse, when checking a SocketPermission guard, DNS may 
be consulted for every non-wildcard SocketPermission contained within a 
SocketPermissionCollection, up until the guard is implied.  DNS checks 
are being made unnecessarily: the wildcard that matches may not require 
a DNS lookup at all, but because the non-matching SocketPermissions are 
checked first, the DNS lookups and reverse lookups are still performed. 
This could be fixed completely by moving the responsibility for DNS 
lookups from SocketPermission to SocketPermissionCollection.
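
To illustrate the ordering argument, here is a rough sketch of a
collection-level check that tries plain string and wildcard matches
first and only resolves names when nothing matched. LazyHostMatcher, its
resolver function and the lookups counter are all hypothetical; the
resolver stands in for DNS, and ports and actions are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

/** Hypothetical sketch: host matching that defers resolution until string matching fails. */
final class LazyHostMatcher {
    private final List<String> grantedHosts = new ArrayList<>();
    private final Function<String, String> resolver; // host -> IP literal, stands in for DNS
    int lookups; // number of resolver calls, exposed only for demonstration

    LazyHostMatcher(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    void add(String hostPattern) {
        grantedHosts.add(hostPattern.toLowerCase(Locale.ROOT));
    }

    boolean implies(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        // Pass 1: exact and wildcard matches, purely on strings.
        for (String g : grantedHosts) {
            if (g.equals("*") || g.equals(h)) return true;
            if (g.startsWith("*.") && h.endsWith(g.substring(1))) return true;
        }
        // Pass 2: resolve only now, and only against non-wildcard entries.
        String ip = resolve(h);
        if (ip == null) return false;
        for (String g : grantedHosts) {
            if (g.startsWith("*")) continue;
            if (ip.equals(resolve(g))) return true;
        }
        return false;
    }

    private String resolve(String host) {
        lookups++;
        return resolver.apply(host);
    }
}
```

A request that matches a wildcard entry returns from the first pass with
the lookups counter untouched, regardless of how many named hosts were
added before the wildcard.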

Two SocketPermissions are equal if they resolve to the same IP address, 
but their hashCodes are different! See bug 6592623.

A SocketPermission with an IP address and one with a DNS name that 
resolves to the same IP address should not (in my opinion) be equal, 
but they are!  One SocketPermission should only imply the other while 
DNS resolves to the same IP address; otherwise the equality of the two 
SocketPermissions will change if the IP address is assigned to a 
different domain!  Object equality / identity shouldn't depend on the 
result of a possibly unreliable network source.

SocketPermission and SocketPermissionCollection are broken. The only 
solution I can think of is to re-implement these classes (from Harmony) 
in the policy and SecurityManager, substituting them for the existing 
JVM classes.  This would not be visible to client developers.

SocketPermissions may also exist in a ProtectionDomain's static 
Permissions; these would have to be converted by the policy when merging 
the permissions from the ProtectionDomain with those from the policy.  
Since ProtectionDomain attempts to check its own internal permissions 
after the policy permission check fails, DNS checks are currently 
performed by duplicate SocketPermissions residing in the 
ProtectionDomain. This will no longer occur, since the permission being 
checked will be converted to, say, for argument's sake, 
org.apache.river.security.SocketPermission.  However, because some 
ProtectionDomains are static, they never consult the policy, so the 
Permissions contained in each such ProtectionDomain will also require 
conversion. Doing so will require extending and implementing a 
ProtectionDomain that encapsulates the existing ProtectionDomains in the 
AccessControlContext, by utilising a DomainCombiner.

For CodeSource grants, the policy file based grants are defined by 
URLs. However, a URL's identity depends upon DNS record results, 
similar to the SocketPermission equals and hashCode implementations, 
which we have no control over.

I'm thinking about implementing URI based grants instead, to avoid DNS 
lookups, then allowing a policy compatibility mode to be enabled (with 
logging) for falling back to CodeSource grants when a URL cannot be 
converted to a URI. This is a much simpler fix than the SocketPermission 
problem.

For Dynamic Policy Grants, because ProtectionDomain doesn't override 
equals (that's a good thing), the contained CodeSource must also be 
checked, again potentially slowing down permission checks with DNS 
lookups, simply because CodeSource uses URLs.  Changing the Dynamic 
Grants to use URI based comparison would be relatively simple, since 
the URI is obtained dynamically when the dynamic grant is created.

URI based grants don't use DNS resolution and would have a narrower 
scope of implied CodeSources: an IP based grant won't imply a DNS domain 
URL based CodeSource and vice versa.  Rather than rely on DNS 
resolution, grants could be made specifically for IPv4, IPv6 and DNS 
names in policy files.  URL.toURI() can be utilised to check if URI 
grants imply a CodeSource without resorting to DNS.
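
As a rough illustration of that last point, the sketch below converts a
CodeSource location with URL.toURI() and compares it against a grant
using string rules only. UriGrantSketch and its helper are hypothetical
names, and the matching rule is deliberately simplified (only an exact
match or a trailing "/-" meaning "everything beneath"); it is not the
full CodeSource.implies() wildcard behaviour.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

/** Hypothetical sketch of a URI based grant check; not the River implementation. */
final class UriGrantSketch {

    /**
     * Simplified rule: a grant ending in "/-" covers everything beneath that
     * path; any other grant must equal the location exactly.
     */
    static boolean implies(String grant, URL location) {
        final URI loc;
        try {
            loc = location.toURI().normalize(); // string parsing only, no DNS
        } catch (URISyntaxException e) {
            return false; // a compatibility mode could fall back to CodeSource here
        }
        if (grant.endsWith("/-")) {
            String prefix = grant.substring(0, grant.length() - 1); // keeps the '/'
            return loc.toString().startsWith(prefix);
        }
        return URI.create(grant).normalize().equals(loc);
    }

    /** Demo helper: builds a URL without the checked exception. */
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

A grant written for "http://example.com/lib/-" covers
"http://example.com/lib/a/b.jar" but not the same path served from an IP
literal, which is the narrower scope described above.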

Any thoughts, comments or ideas?

N.B. It's sad that security is implemented the way it is. It would be 
far better if it were Executor based, since every protection domain 
could be checked in parallel, rather than in sequence.

Regards,

Peter.



Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter <ji...@zeus.net.au>.
Actually, regarding my last comment about Executor based parallel protection domain security checks: it is possible (in spite of AccessControlContext being declared final) by using a DomainCombiner to merge all the ProtectionDomains on the stack into Callable tasks and returning one ProtectionDomain that contains all these tasks. When implies is called, it submits all tasks to an Executor; the first task to return false causes the ProtectionDomain's implies method to return false, while the rest are cancelled using the thread interrupt status.  The policy could check the interrupt status at execution check points and return early if interrupted.

Since a policy is mostly read, with only occasional writes, this could speed execution considerably, especially when permission checks require network or file access and there are a number of ProtectionDomains on the stack.
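
A minimal sketch of the fail-fast part of this idea, assuming each ProtectionDomain check has already been wrapped as a Callable<Boolean>. The class ParallelImplies and the method allImply are hypothetical names; the DomainCombiner wiring, and how each task would carry its own ProtectionDomain and permission, are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch: run all checks in parallel, deny on the first false. */
final class ParallelImplies {

    /** Returns true only if every check returns true; cancels the rest on the first denial. */
    static boolean allImply(ExecutorService executor, List<Callable<Boolean>> checks) {
        CompletionService<Boolean> completed = new ExecutorCompletionService<>(executor);
        List<Future<Boolean>> futures = new ArrayList<>();
        for (Callable<Boolean> check : checks) {
            futures.add(completed.submit(check));
        }
        boolean result = true;
        try {
            for (int i = 0; i < futures.size(); i++) {
                // take() yields tasks in completion order, so a fast denial wins.
                if (!Boolean.TRUE.equals(completed.take().get())) {
                    result = false;
                    break;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            result = false; // fail closed
        } catch (ExecutionException e) {
            result = false; // a check that threw is treated as a denial
        } finally {
            for (Future<Boolean> f : futures) {
                f.cancel(true); // interrupts stragglers; no effect on finished tasks
            }
        }
        return result;
    }
}
```

Cancellation here relies on the running checks noticing the interrupt, which is why the policy would need to test the interrupt status at its check points.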

Cheers,

Peter.



Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Gregg Wonderly <gr...@wonderly.org>.
Yes, that is what I am referring to, and it turns out that for "lookup 
servers", it provides a natural throttling mechanism for the OS so that 
applications and the machine are not overloaded by huge amounts of traffic.

What I was proposing was to look at making Reggie perform this kind of 
"limiting" itself with tunable parameters, or at least some sense of natural 
progression of resource reduction.  This would allow things to "continue" to 
work, but at a reduced rate, so that the behavior would not deteriorate to 
not-working and result in the kinds of failures that you and others have 
experienced.
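
As a rough sketch of what such a tunable limit might look like (this is 
not Reggie code; RequestLimiter, maxConcurrent and waitMillis are names 
invented for illustration), a counting semaphore can cap the number of 
requests in flight and turn away the overflow after a short wait, so that 
callers retry later:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch: cap concurrent requests, reject the overflow after a short wait. */
final class RequestLimiter {
    private final Semaphore permits;
    private final long waitMillis;

    RequestLimiter(int maxConcurrent, long waitMillis) {
        this.permits = new Semaphore(maxConcurrent);
        this.waitMillis = waitMillis;
    }

    /** Runs the request if a slot frees up within waitMillis, otherwise returns rejected. */
    <T> T call(Supplier<T> request, T rejected) {
        boolean acquired = false;
        try {
            acquired = permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
            if (!acquired) {
                return rejected; // caller is expected to back off and retry
            }
            return request.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return rejected;
        } finally {
            if (acquired) {
                permits.release();
            }
        }
    }
}
```

Both constructor arguments would be the "tunable parameters": one bounds 
the load, the other decides how long a caller may queue before being told 
to come back later.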

Gregg

On 12/13/2011 9:04 AM, Christopher Dolan wrote:
> I think you're referring to this: http://support.microsoft.com/kb/314882 ("Inbound connections limit in Windows XP"). If so, that applies only to WinXP. I understood that Microsoft relaxed that restriction for Vista and later. As you say it did not apply to the server OS, specifically Win 2003.
>
> So, I wouldn't bother with a specific Reggie patch for this issue, as it will be less and less important as time progresses.
>
> Chris
>
> -----Original Message-----
> From: Gregg Wonderly [mailto:gregg@wonderly.org]
> Sent: Tuesday, December 13, 2011 8:56 AM
> To: dev@river.apache.org
> Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups
>
> Also, one simple reminder about "Windows".  The folks at Microsoft want to be
> able to make you buy server class OSes, so the user OSes limit the number of
> simultaneous socket connections as well as other things, so that you can't buy a
> cheap "user" seat and make a "server" of any substance out of it.   But, when
> you put a Jini LUS instance, such as Reggie, on a "user" seat machine, these
> limitations can "help" control overload.  What happens, is that Windows will
> throw out "RST" packets when too many connections occur, and cause the
> connecting machines to back off.
>
> I don't have specific numbers to show, but practically, it will cause a few
> machines at a time to register, and others to retry later when the next
> multicast announcement goes out.
>
> From some perspectives, we might want to look at providing a "setting" for
> reggie which would cause it to limit the total number of inbound registrations
> and lookups in a way which would provide for some good old fashioned resource
> management that worked well to keep what Chris mentions here from happening.
>
> Gregg Wonderly
>
> On 12/13/2011 8:31 AM, Christopher Dolan wrote:
>> Quite true Gregg, but that doesn't help when Reggie boots and hundreds of hosts contact it in a short time span against a cold DNS cache. Prior to resolution of RIVER-396 ("PreferredClassProvider classloader cache concurrency improvement") these timeout failures were effectively serial and caused long stalls. The resulting OOMEs and failed thread creation events in some isolated scenarios were unrecoverable. For me, this was mitigated by the triple solution of 1) turning off the SocketPermission check, 2) the RIVER-396 patch and 3) switching JERI to NIO to save some threads.
>>
>> Chris
>>
>> -----Original Message-----
>> From: Gregg Wonderly [mailto:gregg@wonderly.org]
>> Sent: Tuesday, December 13, 2011 8:19 AM
>> To: dev@river.apache.org
>> Cc: Peter Firmstone
>> Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups
>>
>> Remember too, from a general "workaround" perspective, that you can use command
>> line options to "lengthen" the time that DNS failure information is retained, to
>> keep things moving when no reverse DNS information is available.  The default,
>> is like 10 seconds, and that is considerably shorter than what you will
>> generally experience in a failed lookup.  The end result, is that the failure
>> cache doesn't serve much purpose without it having a very extended time, as a
>> workaround.   In some cases, I've set it to an hour or more, and some initial
>> startup is then "slow", and initial client "connection" can be a little slow,
>> but then things move along quite well.
>>
>> Gregg Wonderly
>>
>> On 12/13/2011 2:56 AM, Peter Firmstone wrote:
>>> In addition CodeSource.implies() also causes DNS checks, I'm not 100% sure
>>> about the jvm code, but Harmony code uses SocketPermission.implies() to check
>>> if one CodeSource implies another, I believe the jvm policy implementation
>>> also utilises it, because Harmony's implementation is built from Sun's Java spec.
>>>
>>> So in the existing policy implementations, when parsing the policy files,
>>> additional start up delays may be caused by the CodeSource.implies() method
>>> making network DNS calls.
>>>
>>> In my ConcurrentPolicyFile implementation (to replace the standard java
>>> PolicyFile implementation), I've created a URIGrant, I've taken code from
>>> Harmony to implement implies(ProtectionDomain pd), that performs wildcard
>>> matching compliant with CodeSource.implies, the only difference being, that no
>>> attempt to resolve URI's is made.
>>>
>>> Typically most policy files specify file based URL's for CodeSource, however
>>> in a network application where many CodeSources may be network URL's, DNS
>>> lookup causes added delays.
>>>
>>> I've also created a CodeSourceGrant which uses CodeSource.implies() for
>>> backward compatibility with existing java policy files, however I'm sure that
>>> most will simply want to revise their policy files.
>>>
>>> The standard interface PermissionGrant, is implemented by the following
>>> inheritance hierarchy of immutable classes:
>>>
>>>                      PrincipalGrant
>>>            ________________|________________
>>>           |                                 |
>>>  ProtectionDomainGrant              CertificateGrant
>>>           |                     ____________|____________
>>>           |                    |                         |
>>>   ClassLoaderGrant          URIGrant              CodeSourceGrant
>>>
>>>
>>> Only PrincipalGrant is publicly visible, a builder returns the correct
>>> implementation.
>>>
>>> ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the
>>> completely new DynamicPolicyProvider (which has long since passed all tests).
>>>
>>> CertificateGrant, URIGrant and CodeSourceGrant are used by the file based
>>> policies and RemotePolicy, which is intended to be a service that nodes in a
>>> djinn can use to allow an administrator to update the policy (eg to include
>>> new certificates or principals), with all the protection of subject
>>> authentication and secure connections.  RemotePolicy is idempotent, the policy
>>> is updated in one operation, so the current policy state is always known to
>>> the administrator (who is a client).
>>>
>>> Since a File based policy is mostly read and only written when refreshed,
>>> PermissionGrant's are held in a volatile array reference, copied (only the
>>> reference) by any code that reads the array.  The array reference is updated
>>> when the policy is updated, the array is never mutated after publishing.
>>>
>>> A ConcurrentMap<ProtectionDomain, PermissionCollection>   (with weak keys) acts
>>> as a cache, I've got ConcurrentPermissions, an implementation that replaces
>>> the heterogeneous java.security.Permissions class, this also resolves any
>>> unresolved permissions.
>>>
>>> However I'm starting to wonder if it's wiser to throw away the cache
>>> altogether and simply build java.security.Permissions on demand, then throw
>>> Permissions away immediately after use for collection in the young generation
>>> heap (it's likely to fit in level 2 cache and never even be copied to Ram).
>>> This would eliminate contention between existing PermissionCollection's that
>>> block, like SocketPermissionCollection.
>>>
>>> So if you have for instance 100 different AccessControlContext's being checked
>>> by different threads, that all contain the same ProtectionDomain's for a
>>> SocketPermission, then all will be executed in parallel.  Currently due to
>>> blocking, each SocketPermission that performs a DNS check must either resolve
>>> or timeout, before its SocketPermissionCollection can release its
>>> synchronization lock (and there may be multiple SocketPermissions in a
>>> SocketPermissionCollection), before another thread can check its context and
>>> so on, which explains everything coming to a standstill.
>>>
>>> If all permission checks execute in parallel independently, without blocking,
>>> then the timeout won't be magnified.
>>>
>>> I am considering going one step further and replacing SocketPermission and
>>> SocketPermissionCollection, and implementing DNS checks in the
>>> SocketPermissionCollection rather than SocketPermission.  By doing this a
>>> matching record will be found in most cases without requiring DNS reverse
>>> lookup.  If I keep this as an internal policy implementation detail, then if
>>> Oracle fixes SocketPermission, we can return to using the standard java
>>> implementation, in fact I could make it a configuration property.
>>>
>>> It's an unfortunate fact that not all permission checks are performed in the
>>> policy, replacing SocketPermission also requires the cooperation of the
>>> SecurityManager.  To make matters worse, static ProtectionDomains created
>>> prior to my policy implementation being constructed will never consult my
>>> policy implementation as such they will still contain SocketPermission.   So
>>> the SecurityManager would need to check each ProtectionDomain for both
>>> implementations, so reimplementing SocketPermission doesn't eliminate its use
>>> entirely.
>>>
>>> It's worth noting that SocketPermission is implemented rather poorly and the
>>> same functionality can be provided with far fewer DNS lookups being performed,
>>> since the majority are performed completely unnecessarily.  Perhaps it's worth
>>> me donating some time to OpenJDK to fix it, I'd have to check with Apache
>>> legal first I suppose.
>>>
>>> The problems with DNS lookup also affect the CodeSource and URL equals and
>>> hashCode methods, so these classes shouldn't be used in collections.
>>>
>>> Cheers,
>>>
>>> Peter.
>>>
>>> Christopher Dolan wrote:
>>>> To simulate the problem, go to InetAddress.getHostFromNameService() in your
>>>> IDE, set a breakpoint on the "nameService.getHostByAddr" line with a
>>>> condition of something like this:
>>>>
>>>>        new java.util.concurrent.CountDownLatch(1).await(15,
>>>> java.util.concurrent.TimeUnit.SECONDS)
>>>>
>>>> then launch your River application from within the IDE. This will cause all
>>>> reverse DNS lookups to stall for 15 seconds before succeeding. This will
>>>> affect Reggie the worst because it has to verify so many hostnames. In a
>>>> large group (a few thousand services) this will drive Reggie's thread count
>>>> skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>>>>
>>>> This problem happens in the real world in facilities that allow client
>>>> connections to the production LAN, but do not allow the production LAN to
>>>> resolve hosts in the client LAN. This may occur due to separate IT teams or
>>>> strict security rules or simple configuration errors. Because most
>>>> client-server systems, like web servers, do not require the server to contact
>>>> the client this problem does not become immediately visible to IT. Instead,
>>>> the question is inevitably "Why is Jini/River so sensitive to reverse DNS?
>>>> All of my other services work fine."
>>>>
>>>> Chris
>>>>
>>>> -----Original Message-----
>>>> From: Tom Hobbs [mailto:tvhobbs@googlemail.com]
>>>> Sent: Monday, December 12, 2011 1:43 PM
>>>> To: dev@river.apache.org
>>>> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and
>>>> DNS lookups
>>>>
>>>> My biggest concern with such fundamental changes is controlling the impact
>>>> it will have.  I'm a pretty good example of this, I haven't experienced the
>>>> troubles these changes are intended to overcome.  I also haven't made
>>>> any attempt to dive into these areas of the code, for any reason.
>>>>
>>>> Is it possible to put together a test case which exposes these problems and
>>>> also proves the solution?
>>>>
>>>> Obviously, a test case involving misconfigured networks is daft, in that
>>>> instance a handy "is your network misconfigured" diagnostic tool or
>>>> documentation would be a good idea.
>>>>
>>>> Please don't interpret this concern as a criticism of your work, Peter.
>>>> Far from it.  It's just a comment born out of not really having any contact
>>>> with the area you're working in!
>>>>
>>>>
>>>> Grammar and spelling have been sacrificed on the altar of messaging via
>>>> mobile device.
>>>>
>>>> On 12 Dec 2011 18:01, "Christopher Dolan"<ch...@avid.com>
>>>> wrote:
>>>>
>>>>> Specifically for SocketPermission, I experienced severe timeout problems
>>>>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>>>>> relaxed this criterion via 'new SocketPermission("*",
>>>>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>>>>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>>>>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>>>>> lookup. To avoid this inconvenient setting, I install a new
>>>>> java.security.Policy subclass that delegates to the default Policy except
>>>>> when the incoming permission is a SocketPermission. That way I don't need
>>>>> to modify the policy file in the JVM. The Policy.implies() override method
>>>>> is trivial because it just needs to do " if (permission instanceof
>>>>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>>>>> to override (skip over any SocketPermission elements in the default
>>>>> Policy's PermissionCollection), but still only about 50 LOC.
>>>>>
>>>>> Chris
>>>>>
>>>>> -----Original Message-----
>>>>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>>>>> Sent: Friday, December 09, 2011 9:28 PM
>>>>> To: dev@river.apache.org
>>>>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>>>>> lookups
>>>>>
>>>>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>>>>> equals, hashCode and implies methods create some serious performance
>>>>> problems for distributed programs.
>>>>>
>>>>> The concurrent policy implementation I've been working on reduces lock
>>>>> contention between threads performing security checks.
>>>>>
>>>>> When the SecurityManager is used to check a guard, it calls the
>>>>> AccessController, which retrieves the AccessControlContext from the call
>>>>> stack, this contains all the ProtectionDomain's on the call stack (I
>>>>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>>>>> it will consult the Policy, prior to checking the static permissions it
>>>>> contains.
>>>>>
>>>>> The problem with the old policy implementation is lock contention caused
>>>>> by multiple threads all using multiple ProtectionDomains, when the time
>>>>> taken to perform a check is considerable, especially where identical
>>>>> security checks might be performed by multiple threads executing the
>>>>> same code.
>>>>>
>>>>> Although concurrent policy reduces contention between ProtectionDomain's
>>>>> calls to Policy.implies, there remain some fundamental problems with the
>>>>> implementations of SocketPermission and URL, that cause unnecessary DNS
>>>>> lookups during equals(), hashCode() and implies() methods.
>>>>>
>>>>> The following bugs concern SocketPermission (please read before
>>>>> continuing) :
>>>>>
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>>>>> lot of valuable comments.
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>>>>> perhaps incorrectly.
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>>>>
>>>>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>>>>> are performed for the equals and hashCode implementations in
>>>>> SocketPermission and URL, with disastrous performance implications for
>>>>> policy implementations using collections and caching security permission
>>>>> check results.
>>>>>
>>>>> For example, once a SocketPermission guard has been checked for a
>>>>> specific AccessContolContext the result is cached by my SecurityManager,
>>>>> avoiding repeat security checks, however if that cache contains
>>>>> SocketPermission, DNS lookups will be required, the cache will perform
>>>>> slower than some other directly performed security checks!  The cache is
>>>>> intended to return quickly to avoid reconsulting every ProtectionDomain
>>>>> on the stack.
>>>>>
>>>>> To make matters worse, when checking a SocketPermission guard, the DNS
>>>>> may be consulted for every non wild card SocketPermission contained
>>>>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>>>>> are being made unnecessarily, since the wild card that matches may not
>>>>> require a DNS lookup at all, but because the non matching
>>>>> SocketPermission's are being checked first, the DNS lookups and reverse
>>>>> lookups are still performed.  This could be fixed completely, by moving
>>>>> the responsibility of DNS lookups from SocketPermission to
>>>>> SocketPermissionCollection.
>>>>>
>>>>> The identity of two SocketPermission's are equal if they resolve to the
>>>>> same IP address, but their hashCode's are different! See bug 6592623.
>>>>>
>>>>> The identity of a SocketPermission with an IP address and a DNS name,
>>>>> resolving to identical IP address should not (in my opinion) be equal,
>>>>> but is!  One SocketPermission should only imply the other while DNS
>>>>> resolves to the same IP address, otherwise the equality of the two
>>>>> SocketPermission's will change if the IP address is assigned to a
>>>>> different domain!  Object equality / identity shouldn't depend on the
>>>>> result of a possibly unreliable network source.
>>>>>
>>>>> SocketPermission and SocketPermissionCollection are broken, the only
>>>>> solution I can think of is to re-implement these classes (from Harmony)
>>>>> in the policy and SecurityManager, substituting the existing jvm
>>>>> classes.  This would not be visible to client developers.
>>>>>
>>>>> SocketPermission's may also exist in a ProtectionDomain's static
>>>>> Permissions, these would have to be converted by the policy when merging
>>>>> the permissions from the ProtectionDomain with those from the policy.
>>>>> Since ProtectionDomain attempts to check its own internal permissions
>>>>> after the policy permission check fails, DNS checks are currently
>>>>> performed by duplicate SocketPermission's residing in the
>>>>> ProtectionDomain, this will no longer occur, since the permission being
>>>>> checked will be converted to say for argument sake
>>>>> org.apache.river.security.SocketPermission.  However because some
>>>>> ProtectionDomains are static, they never consult the policy, so the
>>>>> Permission's contained in each ProtectionDomain will require conversion
>>>>> also, to do so will require extending and implementing a
>>>>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>>>>> AccessControlContext, by utilising a DomainCombiner.
>>>>>
>>>>> For CodeSource grant's, the policy file based grant's are defined by
>>>>> URL's, however URL's identity depends upon DNS record results, similar to
>>>>> SocketPermission equals and hashCode implementations which we have no
>>>>> control over.
>>>>>
>>>>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>>>>> lookups, then allowing a policy compatibility mode to be enabled (with
>>>>> logging) for falling back to CodeSource grant's when a URL cannot be
>>>>> converted to a URI, this is a much simpler fix than the SocketPermission
>>>>> problem.
>>>>>
>>>>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>>>>> equals (that's a good thing), the contained CodeSource must also be
>>>>> checked, again potentially slowing down permission checks with DNS
>>>>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>>>>> Grant's to use URI based comparison would be relatively simple, since
>>>>> the URI is obtained dynamically when the dynamic grant is created.
>>>>>
>>>>> URI based grant's don't use DNS resolution and would have a narrower
>>>>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>>>>> URL based CodeSource and vice versa.  Rather than rely on DNS
>>>>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>>>>> names in policy files.  URL.toURI() can be utilised to check if URI
>>>>> grant's imply a CodeSource without resorting to DNS.
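
As a rough illustration of the URI based check described above, the sketch below compares a grant URI with a CodeSource location purely as text after URL.toURI(), so no name lookup is involved. UriGrantCheck is a hypothetical name, and the trailing "/-" handling only approximates the policy file wildcard convention; it is not the URIGrant implementation.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper: textual URI comparison, no DNS resolution.
class UriGrantCheck {
    // True when the code source location equals the grant or falls under "grant/-".
    static boolean implies(URI grant, URL codeSourceLocation) {
        URI location;
        try {
            location = codeSourceLocation.toURI().normalize();
        } catch (URISyntaxException e) {
            return false; // not convertible; a CodeSource based fallback would apply here
        }
        String prefix = grant.normalize().toString();
        String target = location.toString();
        if (prefix.endsWith("/-")) {
            // Keep the trailing slash, drop the "-" wildcard marker.
            return target.startsWith(prefix.substring(0, prefix.length() - 1));
        }
        return target.equals(prefix);
    }

    // Convenience overload for callers holding plain strings.
    static boolean implies(String grant, String codeSourceLocation) {
        try {
            return implies(URI.create(grant), URI.create(codeSourceLocation).toURL());
        } catch (MalformedURLException | IllegalArgumentException e) {
            return false;
        }
    }
}
```

With this kind of comparison a grant written with a DNS name does not imply a location written with an IP address, which matches the narrower scope described in the post.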
>>>>>
>>>>> Any thoughts, comments or ideas?
>>>>>
>>>>> N.B. It's sad that security is implemented the way it is, it would be
>>>>> far better if it was Executor based, since every protection domain could
>>>>> be checked in parallel, rather than in sequence.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Peter.
>>>>>
>>>>>
>>>>>
>


RE: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Christopher Dolan <ch...@avid.com>.
I think you're referring to this: http://support.microsoft.com/kb/314882 ("Inbound connections limit in Windows XP"). If so, that applies only to WinXP. I understood that Microsoft relaxed that restriction for Vista and later. As you say it did not apply to the server OS, specifically Win 2003.

So, I wouldn't bother with a specific Reggie patch for this issue, as it will be less and less important as time progresses.

Chris

-----Original Message-----
From: Gregg Wonderly [mailto:gregg@wonderly.org] 
Sent: Tuesday, December 13, 2011 8:56 AM
To: dev@river.apache.org
Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Also, one simple reminder about "Windows".  The folks at Microsoft want you to
buy server-class OSes, so the user OSes limit the number of simultaneous socket
connections, among other things, so that you can't buy a cheap "user" seat and
make a "server" of any substance out of it.  But when you put a Jini LUS
instance, such as Reggie, on a "user" seat machine, these limitations can "help"
control overload.  What happens is that Windows sends "RST" packets when too
many connections arrive, which causes the connecting machines to back off.

I don't have specific numbers to show, but practically, it will cause a few 
machines at a time to register, and others to retry later when the next 
multicast announcement goes out.

From some perspectives, we might want to look at providing a "setting" for
reggie which would cause it to limit the total number of inbound registrations 
and lookups in a way which would provide for some good old fashioned resource 
management that worked well to keep what Chris mentions here from happening.

Gregg Wonderly

On 12/13/2011 8:31 AM, Christopher Dolan wrote:
> Quite true Gregg, but that doesn't help when Reggie boots and hundreds of hosts contact it in a short time span against a cold DNS cache. Prior to resolution of RIVER-396 ("PreferredClassProvider classloader cache concurrency improvement") these timeout failures were effectively serial and caused long stalls. The resulting OOMEs and failed thread creation events in some isolated scenarios were unrecoverable. For me, this was mitigated by the triple solution of 1) turning off the SocketPermission check, 2) the RIVER-396 patch and 3) switching JERI to NIO to save some threads.
>
> Chris
>
> -----Original Message-----
> From: Gregg Wonderly [mailto:gregg@wonderly.org]
> Sent: Tuesday, December 13, 2011 8:19 AM
> To: dev@river.apache.org
> Cc: Peter Firmstone
> Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups
>
> Remember too, from a general "workaround" perspective, that you can use command
> line options to "lengthen" the time that DNS failure information is retained, to
> keep things moving when no reverse DNS information is available.  The default,
> is like 10 seconds, and that is considerably shorter than what you will
> generally experience in a failed lookup.  The end result, is that the failure
> cache doesn't serve much purpose without it having a very extended time, as a
> workaround.   In some cases, I've set it to an hour or more, and some initial
> startup is then "slow", and initial client "connection" can be a little slow,
> but then things move along quite well.
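
On Sun/Oracle derived JVMs the setting Gregg refers to is, as far as I know, the security property networkaddress.cache.negative.ttl (default 10 seconds), with -Dsun.net.inetaddr.negative.ttl=<seconds> as the implementation-specific command line form. A minimal sketch of setting it programmatically follows; the class name and the one hour value are only illustrative, and it assumes the call runs before the JVM performs its first name lookup.

```java
import java.security.Security;

// Sketch only: lengthens how long failed DNS lookups are remembered.
// Assumption: called early, before any InetAddress lookup has happened,
// because the cache policy is typically read once.
class NegativeDnsCacheConfig {
    static final String NEGATIVE_TTL = "networkaddress.cache.negative.ttl";

    static void extendNegativeTtl(int seconds) {
        Security.setProperty(NEGATIVE_TTL, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        extendNegativeTtl(3600); // one hour, as in the workaround described above
        System.out.println(NEGATIVE_TTL + "=" + Security.getProperty(NEGATIVE_TTL));
    }
}
```

The trade-off is the one described in the message: a host whose reverse DNS record is later fixed stays "failed" until the cached entry expires.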
>
> Gregg Wonderly
>
> On 12/13/2011 2:56 AM, Peter Firmstone wrote:
>> In addition CodeSource.implies() also causes DNS checks, I'm not 100% sure
>> about the jvm code, but Harmony code uses SocketPermission.implies() to check
>> if one CodeSource implies another, I believe the jvm policy implementation
>> also utilises it, because harmony's implementation is built from Sun's java spec.
>>
>> So in the existing policy implementations, when parsing the policy files,
>> additional start up delays may be caused by the CodeSource.implies() method
>> making network DNS calls.
>>
>> In my ConcurrentPolicyFile implementation (to replace the standard java
>> PolicyFile implementation), I've created a URIGrant, I've taken code from
>> Harmony to implement implies(ProtectionDomain pd), that performs wildcard
>> matching compliant with CodeSource.implies, the only difference being, that no
>> attempt to resolve URI's is made.
>>
>> Typically most policy files specify file based URL's for CodeSource, however
>> in a network application where many CodeSources may be network URL's, DNS
>> lookup causes added delays.
>>
>> I've also created a CodeSourceGrant which uses CodeSource.implies() for
>> backward compatibility with existing java policy files, however I'm sure that
>> most will simply want to revise their policy files.
>>
>> The standard interface PermissionGrant, is implemented by the following
>> inheritance hierarchy of immutable classes:
>>
>>                         PrincipalGrant
>>               _________________|_________________
>>               |                                 |
>>     ProtectionDomainGrant               CertificateGrant
>>               |                       __________|__________
>>               |                       |                   |
>>       ClassLoaderGrant            URIGrant         CodeSourceGrant
>>
>>
>> Only PrincipalGrant is publicly visible, a builder returns the correct
>> implementation.
>>
>> ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the
>> completely new DynamicPolicyProvider (which has long since passed all tests).
>>
>> CertificateGrant, URIGrant and CodeSourceGrant are used by the File based
>> policies and RemotePolicy, which is intended to be a service that nodes in a
>> djinn can use to allow an administrator to update the policy (eg to include
>> new certificates or principals), with all the protection of subject
>> authentication and secure connections.  RemotePolicy is idempotent, the policy
>> is updated in one operation, so the current policy state is always known to
>> the administrator (who is a client).
>>
>> Since a File based policy is mostly read and only written when refreshed,
>> PermissionGrant's are held in a volatile array reference, copied (only the
>> reference) by any code that reads the array.  The array reference is updated
>> when the policy is updated, the array is never mutated after publishing.
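
The volatile array idiom described above can be sketched as follows. Grant is a hypothetical stand-in for the PermissionGrant interface, reduced to a single method, and GrantTable is an invented name; the real classes carry far more state.

```java
import java.util.Arrays;

// Hypothetical stand-in for PermissionGrant, reduced to one method.
interface Grant {
    boolean implies(String codeSourceUri);
}

// Copy-on-write holder: readers do one volatile read, writers publish a new array.
class GrantTable {
    private volatile Grant[] grants = new Grant[0];

    // Readers copy only the reference; the array behind it is never mutated.
    boolean anyImplies(String codeSourceUri) {
        Grant[] snapshot = grants;
        for (Grant g : snapshot) {
            if (g.implies(codeSourceUri)) {
                return true;
            }
        }
        return false;
    }

    // A refresh builds a private copy and publishes it with a single volatile write.
    void refresh(Grant[] replacement) {
        grants = Arrays.copyOf(replacement, replacement.length);
    }
}
```

Readers never block and always see either the old array or the new one in full, which suits data that is read on every permission check and written only on a policy refresh.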
>>
>> A ConcurrentMap<ProtectionDomain, PermissionCollection>  (with weak keys) acts
>> as a cache, I've got ConcurrentPermissions, an implementation that replaces
>> the heterogeneous java.security.Permissions class, this also resolves any
>> unresolved permissions.
>>
>> However I'm starting to wonder if it's wiser to throw away the cache
>> altogether and simply build java.security.Permissions on demand, then throw
>> Permissions away immediately after use for collection in the young generation
>> heap (it's likely to fit in level 2 cache and never even be copied to Ram).
>> This would eliminate contention between existing PermissionCollection's that
>> block, like SocketPermissionCollection.
>>
>> So if you have for instance 100 different AccessControlContext's being checked
>> by different threads, that all contain the same ProtectionDomain's for a
>> SocketPermission, then all will be executed in parallel.  Currently due to
>> blocking, each SocketPermission that performs a DNS check must either resolve
>> or timeout, before its SocketPermissionCollection can release its
>> synchronization lock (and there may be multiple SocketPermission's in a
>> SocketPermissionCollection), before another thread can check its context and
>> so on, which explains everything coming to a standstill.
>>
>> If all permission checks execute in parallel independently, without blocking,
>> then the timeout won't be magnified.
>>
>> I am considering going one step further and replacing SocketPermission and
>> SocketPermissionCollection, and implementing DNS checks in the
>> SocketPermissionCollection rather than SocketPermission.  By doing this a
>> matching record will be found in most cases without requiring DNS reverse
>> lookup.  If I keep this as an internal policy implementation detail, then if
>> Oracle fixes SocketPermission, we can return to using the standard java
>> implementation, in fact I could make it a configuration property.
>>
>> It's an unfortunate fact that not all permission checks are performed in the
>> policy, replacing SocketPermission also requires the cooperation of the
>> SecurityManager.  To make matters worse, static ProtectionDomains created
>> prior to my policy implementation being constructed will never consult my
>> policy implementation as such they will still contain SocketPermission.   So
>> the SecurityManager would need to check each ProtectionDomain for both
>> implementations, so reimplementing SocketPermission doesn't eliminate its use
>> entirely.
>>
>> It's worth noting that SocketPermission is implemented rather poorly and the
>> same functionality can be provided with far fewer DNS lookups being performed,
>> since the majority are performed completely unnecessarily.  Perhaps it's worth
>> me donating some time to OpenJDK to fix it, I'd have to check with Apache
>> legal first I suppose.
>>
>> The problems with DNS lookup also affect CodeSource and URL equals and
>> hashCode methods, so these classes shouldn't be used in collections.
>>
>> Cheers,
>>
>> Peter.
>>
>> Christopher Dolan wrote:
>>> To simulate the problem, go to InetAddress.getHostFromNameService() in your
>>> IDE, set a breakpoint on the "nameService.getHostByAddr" line with a
>>> condition of something like this:
>>>
>>>       new java.util.concurrent.CountDownLatch(1).await(15,
>>> java.util.concurrent.TimeUnit.SECONDS)
>>>
>>> then launch your River application from within the IDE. This will cause all
>>> reverse DNS lookups to stall for 15 seconds before succeeding. This will
>>> affect Reggie the worst because it has to verify so many hostnames. In a
>>> large group (a few thousand services) this will drive Reggie's thread count
>>> skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>>>
>>> This problem happens in the real world in facilities that allow client
>>> connections to the production LAN, but do not allow the production LAN to
>>> resolve hosts in the client LAN. This may occur due to separate IT teams or
>>> strict security rules or simple configuration errors. Because most
>>> client-server systems, like web servers, do not require the server to contact
>>> the client this problem does not become immediately visible to IT. Instead,
>>> the question is inevitably "Why is Jini/River so sensitive to reverse DNS?
>>> All of my other services work fine."
>>>
>>> Chris
>>>
>>> -----Original Message-----
>>> From: Tom Hobbs [mailto:tvhobbs@googlemail.com]
>>> Sent: Monday, December 12, 2011 1:43 PM
>>> To: dev@river.apache.org
>>> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and
>>> DNS lookups
>>>
>>> My biggest concern with such fundamental changes is controlling the impact
>>> it will have.  I'm a pretty good example of this, I haven't experienced the
>>> troubles these changes are intended to overcome.  I also haven't made
>>> any attempt to dive into these areas of the code, for any reason.
>>>
>>> Is it possible to put together a test case which exposes these problems and
>>> also proves the solution?
>>>
>>> Obviously, a test case involving misconfigured networks is daft, in that
>>> instance a handy "is your network misconfigured" diagnostic tool or
>>> documentation would be a good idea.
>>>
>>> Please don't interpret this concern as a criticism of your work, Peter.
>>> Far from it.  It's just a comment born out of not really having any contact
>>> with the area you're working in!
>>>
>>>
>>> Grammar and spelling have been sacrificed on the altar of messaging via
>>> mobile device.
>>>
>>> On 12 Dec 2011 18:01, "Christopher Dolan"<ch...@avid.com>
>>> wrote:
>>>
>>>> Specifically for SocketPermission, I experienced severe timeout problems
>>>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>>>> relaxed this criterion via 'new SocketPermission("*",
>>>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>>>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>>>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>>>> lookup. To avoid this inconvenient setting, I install a new
>>>> java.security.Policy subclass that delegates to the default Policy except
>>>> when the incoming permission is a SocketPermission. That way I don't need
>>>> to modify the policy file in the JVM. The Policy.implies() override method
>>>> is trivial because it just needs to do " if (permission instanceof
>>>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>>>> to override (skip over any SocketPermission elements in the default
>>>> Policy's PermissionCollection), but still only about 50 LOC.
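
A minimal sketch of the delegating policy Chris describes might look like the following. SocketRelaxingPolicy is a hypothetical name, the sketch covers only the implies() override (not the PermissionCollection filtering he mentions), and granting every SocketPermission is only reasonable on a trusted LAN. Note that java.security.Policy is deprecated in recent JDKs, so this applies to the JVMs of that era.

```java
import java.net.SocketPermission;
import java.security.Permission;
import java.security.Policy;
import java.security.ProtectionDomain;

// Hypothetical sketch: delegate everything except SocketPermission checks.
@SuppressWarnings({"removal", "deprecation"})
class SocketRelaxingPolicy extends Policy {
    private final Policy delegate;

    SocketRelaxingPolicy(Policy delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean implies(ProtectionDomain domain, Permission permission) {
        // Grant every SocketPermission outright so no DNS lookup is triggered.
        if (permission instanceof SocketPermission) {
            return true;
        }
        // Everything else is decided by the wrapped (default) policy.
        return delegate.implies(domain, permission);
    }
}
```

On a JVM that still supports it, such a policy would be installed with something like Policy.setPolicy(new SocketRelaxingPolicy(Policy.getPolicy())) before the SecurityManager is set.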
>>>>
>>>> Chris
>>>>
>>>> -----Original Message-----
>>>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>>>> Sent: Friday, December 09, 2011 9:28 PM
>>>> To: dev@river.apache.org
>>>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>>>> lookups
>>>>
>>>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>>>> equals, hashCode and implies methods create some serious performance
>>>> problems for distributed programs.
>>>>
>>>> The concurrent policy implementation I've been working on reduces lock
>>>> contention between threads performing security checks.
>>>>
>>>> When the SecurityManager is used to check a guard, it calls the
>>>> AccessController, which retrieves the AccessControlContext from the call
>>>> stack, this contains all the ProtectionDomain's on the call stack (I
>>>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>>>> it will consult the Policy, prior to checking the static permissions it
>>>> contains.
>>>>
>>>> The problem with the old policy implementation is lock contention caused
>>>> by multiple threads all using multiple ProtectionDomains, when the time
>>>> taken to perform a check is considerable, especially where identical
>>>> security checks might be performed by multiple threads executing the
>>>> same code.
>>>>
>>>> Although concurrent policy reduces contention between ProtectionDomain's
>>>> calls to Policy.implies, there remain some fundamental problems with the
>>>> implementations of SocketPermission and URL, that cause unnecessary DNS
>>>> lookups during equals(), hashCode() and implies() methods.
>>>>
>>>> The following bugs concern SocketPermission (please read before
>>>> continuing) :
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>>>> lot of valuable comments.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>>>> perhaps incorrectly.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>>>
>>>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>>>> are performed for the equals and hashCode implementations in
>>>> SocketPermission and URL, with disastrous performance implications for
>>>> policy implementations using collections and caching security permission
>>>> check results.
>>>>
>>>> For example, once a SocketPermission guard has been checked for a
>>>> specific AccessControlContext the result is cached by my SecurityManager,
>>>> avoiding repeat security checks, however if that cache contains
>>>> SocketPermission, DNS lookups will be required, the cache will perform
>>>> slower than some other directly performed security checks!  The cache is
>>>> intended to return quickly to avoid reconsulting every ProtectionDomain
>>>> on the stack.
>>>>
>>>> To make matters worse, when checking a SocketPermission guard, the DNS
>>>> may be consulted for every non wild card SocketPermission contained
>>>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>>>> are being made unnecessarily, since the wild card that matches may not
>>>> require a DNS lookup at all, but because the non matching
>>>> SocketPermission's are being checked first, the DNS lookups and reverse
>>>> lookups are still performed.  This could be fixed completely, by moving
>>>> the responsibility of DNS lookups from SocketPermission to
>>>> SocketPermissionCollection.
>>>>
>>>> The identity of two SocketPermission's are equal if they resolve to the
>>>> same IP address, but their hashCode's are different! See bug 6592623.
>>>>
>>>> The identity of a SocketPermission with an IP address and a DNS name,
>>>> resolving to identical IP address should not (in my opinion) be equal,
>>>> but is!  One SocketPermission should only imply the other while DNS
>>>> resolves to the same IP address, otherwise the equality of the two
>>>> SocketPermission's will change if the IP address is assigned to a
>>>> different domain!  Object equality / identity shouldn't depend on the
>>>> result of a possibly unreliable network source.
>>>>
>>>> SocketPermission and SocketPermissionCollection are broken, the only
>>>> solution I can think of is to re-implement these classes (from Harmony)
>>>> in the policy and SecurityManager, substituting the existing jvm
>>>> classes.  This would not be visible to client developers.
>>>>
>>>> SocketPermission's may also exist in a ProtectionDomain's static
>>>> Permissions, these would have to be converted by the policy when merging
>>>> the permissions from the ProtectionDomain with those from the policy.
>>>> Since ProtectionDomain attempts to check its own internal permissions
>>>> after the policy permission check fails, DNS checks are currently
>>>> performed by duplicate SocketPermission's residing in the
>>>> ProtectionDomain, this will no longer occur, since the permission being
>>>> checked will be converted to say for argument sake
>>>> org.apache.river.security.SocketPermission.  However because some
>>>> ProtectionDomains are static, they never consult the policy, so the
>>>> Permission's contained in each ProtectionDomain will require conversion
>>>> also, to do so will require extending and implementing a
>>>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>>>> AccessControlContext, by utilising a DomainCombiner.
>>>>
>>>> For CodeSource grant's, the policy file based grant's are defined by
>>>> URL's, however URL's identity depends upon DNS record results, similar to
>>>> SocketPermission equals and hashCode implementations which we have no
>>>> control over.
>>>>
>>>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>>>> lookups, then allowing a policy compatibility mode to be enabled (with
>>>> logging) for falling back to CodeSource grant's when a URL cannot be
>>>> converted to a URI, this is a much simpler fix than the SocketPermission
>>>> problem.
>>>>
>>>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>>>> equals (that's a good thing), the contained CodeSource must also be
>>>> checked, again potentially slowing down permission checks with DNS
>>>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>>>> Grant's to use URI based comparison would be relatively simple, since
>>>> the URI is obtained dynamically when the dynamic grant is created.
>>>>
>>>> URI based grant's don't use DNS resolution and would have a narrower
>>>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>>>> URL based CodeSource and vice versa.  Rather than rely on DNS
>>>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>>>> names in policy files.  URL.toURI() can be utilised to check if URI
>>>> grant's imply a CodeSource without resorting to DNS.
>>>>
>>>> Any thoughts, comments or ideas?
>>>>
>>>> N.B. It's sad that security is implemented the way it is, it would be
>>>> far better if it was Executor based, since every protection domain could
>>>> be checked in parallel, rather than in sequence.
>>>>
>>>> Regards,
>>>>
>>>> Peter.
>>>>
>>>>
>>>>
>>
>


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Gregg Wonderly <gr...@wonderly.org>.
Also, one simple reminder about "Windows".  The folks at Microsoft want you to
buy server-class OSes, so the user OSes limit the number of simultaneous socket
connections, among other things, so that you can't buy a cheap "user" seat and
make a "server" of any substance out of it.  But when you put a Jini LUS
instance, such as Reggie, on a "user" seat machine, these limitations can "help"
control overload.  What happens is that Windows sends "RST" packets when too
many connections arrive, which causes the connecting machines to back off.

I don't have specific numbers to show, but practically, it will cause a few 
machines at a time to register, and others to retry later when the next 
multicast announcement goes out.

From some perspectives, we might want to look at providing a "setting" for
reggie which would cause it to limit the total number of inbound registrations 
and lookups in a way which would provide for some good old fashioned resource 
management that worked well to keep what Chris mentions here from happening.
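
One way such a "setting" could work is a simple admission limit in front of registration handling. The sketch below is only an illustration of the idea: RegistrationLimiter is a hypothetical class, not part of Reggie or any River API, and the limit value would come from configuration.

```java
import java.util.concurrent.Semaphore;

// Hypothetical admission limiter; not part of Reggie or any River API.
class RegistrationLimiter {
    private final Semaphore permits;

    RegistrationLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs the task if a slot is free; otherwise returns false immediately so the
    // caller can refuse the request and let the client retry later.
    boolean tryHandle(Runnable task) {
        if (!permits.tryAcquire()) {
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            permits.release();
        }
    }
}
```

Refusing work up front, rather than queueing it, mirrors the back-off behaviour described above: rejected clients simply try again on a later multicast announcement.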

Gregg Wonderly

On 12/13/2011 8:31 AM, Christopher Dolan wrote:
> Quite true Gregg, but that doesn't help when Reggie boots and hundreds of hosts contact it in a short time span against a cold DNS cache. Prior to resolution of RIVER-396 ("PreferredClassProvider classloader cache concurrency improvement") these timeout failures were effectively serial and caused long stalls. The resulting OOMEs and failed thread creation events in some isolated scenarios were unrecoverable. For me, this was mitigated by the triple solution of 1) turning off the SocketPermission check, 2) the RIVER-396 patch and 3) switching JERI to NIO to save some threads.
>
> Chris
>
> -----Original Message-----
> From: Gregg Wonderly [mailto:gregg@wonderly.org]
> Sent: Tuesday, December 13, 2011 8:19 AM
> To: dev@river.apache.org
> Cc: Peter Firmstone
> Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups
>
> Remember too, from a general "workaround" perspective, that you can use command
> line options to "lengthen" the time that DNS failure information is retained, to
> keep things moving when no reverse DNS information is available.  The default,
> is like 10 seconds, and that is considerably shorter than what you will
> generally experience in a failed lookup.  The end result, is that the failure
> cache doesn't serve much purpose without it having a very extended time, as a
> workaround.   In some cases, I've set it to an hour or more, and some initial
> startup is then "slow", and initial client "connection" can be a little slow,
> but then things move along quite well.
>
> Gregg Wonderly
>
> On 12/13/2011 2:56 AM, Peter Firmstone wrote:
>> In addition CodeSource.implies() also causes DNS checks, I'm not 100% sure
>> about the jvm code, but Harmony code uses SocketPermission.implies() to check
>> if one CodeSource implies another, I believe the jvm policy implementation
>> also utilises it, because harmony's implementation is built from Sun's java spec.
>>
>> So in the existing policy implementations, when parsing the policy files,
>> additional start up delays may be caused by the CodeSource.implies() method
>> making network DNS calls.
>>
>> In my ConcurrentPolicyFile implementation (to replace the standard java
>> PolicyFile implementation), I've created a URIGrant, I've taken code from
>> Harmony to implement implies(ProtectionDomain pd), that performs wildcard
>> matching compliant with CodeSource.implies, the only difference being, that no
>> attempt to resolve URI's is made.
>>
>> Typically most policy files specify file based URL's for CodeSource, however
>> in a network application where many CodeSources may be network URL's, DNS
>> lookup causes added delays.
>>
>> I've also created a CodeSourceGrant which uses CodeSource.implies() for
>> backward compatibility with existing java policy files, however I'm sure that
>> most will simply want to revise their policy files.
>>
>> The standard interface PermissionGrant, is implemented by the following
>> inheritance hierarchy of immutable classes:
>>
>>                         PrincipalGrant
>>               _________________|_________________
>>               |                                 |
>>     ProtectionDomainGrant               CertificateGrant
>>               |                       __________|__________
>>               |                       |                   |
>>       ClassLoaderGrant            URIGrant         CodeSourceGrant
>>
>>
>> Only PrincipalGrant is publicly visible, a builder returns the correct
>> implementation.
>>
>> ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the
>> completely new DynamicPolicyProvider (which has long since passed all tests).
>>
>> CertificateGrant, URIGrant and CodeSourceGrant are used by the File based
>> policies and RemotePolicy, which is intended to be a service that nodes in a
>> djinn can use to allow an administrator to update the policy (eg to include
>> new certificates or principals), with all the protection of subject
>> authentication and secure connections.  RemotePolicy is idempotent, the policy
>> is updated in one operation, so the current policy state is always known to
>> the administrator (who is a client).
>>
>> Since a File based policy is mostly read and only written when refreshed,
>> PermissionGrant's are held in a volatile array reference, copied (only the
>> reference) by any code that reads the array.  The array reference is updated
>> when the policy is updated, the array is never mutated after publishing.
>>
>> A ConcurrentMap<ProtectionDomain, PermissionCollection>  (with weak keys) acts
>> as a cache, I've got ConcurrentPermissions, an implementation that replaces
>> the heterogeneous java.security.Permissions class, this also resolves any
>> unresolved permissions.
>>
>> However I'm starting to wonder if it's wiser to throw away the cache
>> altogether and simply build java.security.Permissions on demand, then throw
>> Permissions away immediately after use for collection in the young generation
>> heap (it's likely to fit in level 2 cache and never even be copied to Ram).
>> This would eliminate contention between existing PermissionCollection's that
>> block, like SocketPermissionCollection.
>>
>> So if you have for instance 100 different AccessControlContext's being checked
>> by different threads, that all contain the same ProtectionDomain's for a
>> SocketPermission, then all will be executed in parallel.  Currently due to
>> blocking, each SocketPermission that performs a DNS check must either resolve
>> or timeout, before its SocketPermissionCollection can release its
>> synchronization lock (and there may be multiple SocketPermission's in a
>> SocketPermissionCollection), before another thread can check its context and
>> so on, which explains everything coming to a standstill.
>>
>> If all permission checks execute in parallel independently, without blocking,
>> then the timeout won't be magnified.
>>
>> I am considering going one step further and replacing SocketPermission and
>> SocketPermissionCollection, and implementing DNS checks in the
>> SocketPermissionCollection rather than SocketPermission.  By doing this a
>> matching record will be found in most cases without requiring DNS reverse
>> lookup.  If I keep this as an internal policy implementation detail, then if
>> Oracle fixes SocketPermission, we can return to using the standard java
>> implementation, in fact I could make it a configuration property.
>>
>> It's an unfortunate fact that not all permission checks are performed in the
>> policy, replacing SocketPermission also requires the cooperation of the
>> SecurityManager.  To make matters worse, static ProtectionDomains created
>> prior to my policy implementation being constructed will never consult my
>> policy implementation as such they will still contain SocketPermission.   So
>> the SecurityManager would need to check each ProtectionDomain for both
>> implementations, so reimplementing SocketPermission doesn't eliminate its use
>> entirely.
>>
>> It's worth noting that SocketPermission is implemented rather poorly and the
>> same functionality can be provided with far fewer DNS lookups being performed,
>> since the majority are performed completely unnecessarily.  Perhaps it's worth
>> me donating some time to OpenJDK to fix it, I'd have to check with Apache
>> legal first I suppose.
>>
>> The problems with DNS lookup also affect the CodeSource and URL equals and
>> hashCode methods, so these classes shouldn't be used in collections.
>>
>> Cheers,
>>
>> Peter.
>>
>> Christopher Dolan wrote:
>>> To simulate the problem, go to InetAddress.getHostFromNameService() in your
>>> IDE, set a breakpoint on the "nameService.getHostByAddr" line with a
>>> condition of something like this:
>>>
>>>       new java.util.concurrent.CountDownLatch(1).await(15,
>>> java.util.concurrent.TimeUnit.SECONDS)
>>>
>>> then launch your River application from within the IDE. This will cause all
>>> reverse DNS lookups to stall for 15 seconds before succeeding. This will
>>> affect Reggie the worst because it has to verify so many hostnames. In a
>>> large group (a few thousand services) this will drive Reggie's thread count
>>> skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>>>
>>> This problem happens in the real world in facilities that allow client
>>> connections to the production LAN, but do not allow the production LAN to
>>> resolve hosts in the client LAN. This may occur due to separate IT teams or
>>> strict security rules or simple configuration errors. Because most
>>> client-server systems, like web servers, do not require the server to contact
>>> the client this problem does not become immediately visible to IT. Instead,
>>> the question is inevitably "Why is Jini/River so sensitive to reverse DNS?
>>> All of my other services work fine."
>>>
>>> Chris
>>>
>>> -----Original Message-----
>>> From: Tom Hobbs [mailto:tvhobbs@googlemail.com] Sent: Monday, December 12,
>>> 2011 1:43 PM
>>> To: dev@river.apache.org
>>> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and
>>> DNS lookups
>>>
>>> My biggest concern with such fundamental changes is controlling the impact
>>> it will have.  I'm a pretty good example of this, I haven't experienced the
>>> troubles these changes are intended to overcome.  I also haven't made
>>> any attempt to dive into these areas of the code, for any reason.
>>>
>>> Is it possible to put together a test case which exposes these problems and
>>> also proves the solution?
>>>
>>> Obviously, a test case involving misconfigured networks is daft; in that
>>> instance a handy "is your network misconfigured?" diagnostic tool or
>>> documentation would be a good idea.
>>>
>>> Please don't interpret this concern as a criticism of your work, Peter.
>>> Far from it.  It's just a comment born out of not really having any contact
>>> with the area you're working in!
>>>
>>>
>>> Grammar and spelling have been sacrificed on the altar of messaging via
>>> mobile device.
>>>
>>> On 12 Dec 2011 18:01, "Christopher Dolan"<ch...@avid.com>
>>> wrote:
>>>
>>>> Specifically for SocketPermission, I experienced severe timeout problems
>>>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>>>> relaxed this criterion via 'new SocketPermission("*",
>>>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>>>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>>>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>>>> lookup. To avoid this inconvenient setting, I install a new
>>>> java.security.Policy subclass that delegates to the default Policy except
>>>> when the incoming permission is a SocketPermission. That way I don't need
>>>> to modify the policy file in the JVM. The Policy.implies() override method
>>>> is trivial because it just needs to do " if (permission instanceof
>>>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>>>> to override (skip over any SocketPermission elements in the default
>>>> Policy's PermissionCollection), but still only about 50 LOC.
>>>>
>>>> Chris
>>>>
>>>> -----Original Message-----
>>>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>>>> Sent: Friday, December 09, 2011 9:28 PM
>>>> To: dev@river.apache.org
>>>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>>>> lookups
>>>>
>>>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>>>> equals, hashCode and implies methods create some serious performance
>>>> problems for distributed programs.
>>>>
>>>> The concurrent policy implementation I've been working on reduces lock
>>>> contention between threads performing security checks.
>>>>
>>>> When the SecurityManager is used to check a guard, it calls the
>>>> AccessController, which retrieves the AccessControlContext from the call
>>>> stack, this contains all the ProtectionDomain's on the call stack (I
>>>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>>>> it will consult the Policy, prior to checking the static permissions it
>>>> contains.
>>>>
>>>> The problem with the old policy implementation is lock contention caused
>>>> by multiple threads all using multiple ProtectionDomains, when the time
>>>> taken to perform a check is considerable, especially where identical
>>>> security checks might be performed by multiple threads executing the
>>>> same code.
>>>>
>>>> Although concurrent policy reduces contention between ProtectionDomain's
>>>> calls to Policy.implies, there remain some fundamental problems with the
>>>> implementations of SocketPermission and URL, that cause unnecessary DNS
>>>> lookups during equals(), hashCode() and implies() methods.
>>>>
>>>> The following bugs concern SocketPermission (please read before
>>>> continuing) :
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>>>> lot of valuable comments.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>>>> perhaps incorrectly.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>>>
>>>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>>>> are performed for the equals and hashCode implementations in
>>>> SocketPermission and URL, with disastrous performance implications for
>>>> policy implementations using collections and caching security permission
>>>> check results.
>>>>
>>>> For example, once a SocketPermission guard has been checked for a
>>>> specific AccessControlContext the result is cached by my SecurityManager,
>>>> avoiding repeat security checks, however if that cache contains
>>>> SocketPermission, DNS lookups will be required, the cache will perform
>>>> slower than some other directly performed security checks!  The cache is
>>>> intended to return quickly to avoid reconsulting every ProtectionDomain
>>>> on the stack.
>>>>
>>>> To make matters worse, when checking a SocketPermission guard, the DNS
>>>> may be consulted for every non wild card SocketPermission contained
>>>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>>>> are being made unnecessarily, since the wild card that matches may not
>>>> require a DNS lookup at all, but because the non matching
>>>> SocketPermission's are being checked first, the DNS lookups and reverse
>>>> lookups are still performed.  This could be fixed completely, by moving
>>>> the responsibility of DNS lookups from SocketPermission to
>>>> SocketPermissionCollection.
>>>>
>>>> The identity of two SocketPermission's are equal if they resolve to the
>>>> same IP address, but their hashCode's are different! See bug 6592623.
>>>>
>>>> The identity of a SocketPermission with an IP address and a DNS name,
>>>> resolving to identical IP address should not (in my opinion) be equal,
>>>> but is!  One SocketPermission should only imply the other while DNS
>>>> resolves to the same IP address, otherwise the equality of the two
>>>> SocketPermission's will change if the IP address is assigned to a
>>>> different domain!  Object equality / identity shouldn't depend on the
>>>> result of a possibly unreliable network source.
>>>>
>>>> SocketPermission and SocketPermissionCollection are broken, the only
>>>> solution I can think of is to re-implement these classes (from Harmony)
>>>> in the policy and SecurityManager, substituting the existing jvm
>>>> classes.  This would not be visible to client developers.
>>>>
>>>> SocketPermission's may also exist in a ProtectionDomain's static
>>>> Permissions, these would have to be converted by the policy when merging
>>>> the permissions from the ProtectionDomain with those from the policy.
>>>> Since ProtectionDomain attempts to check its own internal permissions
>>>> after the policy permission check fails, DNS checks are currently
>>>> performed by duplicate SocketPermission's residing in the
>>>> ProtectionDomain, this will no longer occur, since the permission being
>>>> checked will be converted to say for argument sake
>>>> org.apache.river.security.SocketPermission.  However because some
>>>> ProtectionDomains are static, they never consult the policy, so the
>>>> Permission's contained in each ProtectionDomain will require conversion
>>>> also, to do so will require extending and implementing a
>>>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>>>> AccessControlContext, by utilising a DomainCombiner.
>>>>
>>>> For CodeSource grant's, the policy file based grant's are defined by
>>>> URL's, however URL's identity depend upon DNS record results, similar to
>>>> SocketPermission equals and hashCode implementations which we have no
>>>> control over.
>>>>
>>>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>>>> lookups, then allowing a policy compatibility mode to be enabled (with
>>>> logging) for falling back to CodeSource grant's when a URL cannot be
>>>> converted to a URI, this is a much simpler fix than the SocketPermission
>>>> problem.
>>>>
>>>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>>>> equals (that's a good thing), the contained CodeSource must also be
>>>> checked, again potentially slowing down permission checks with DNS
>>>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>>>> Grant's to use URI based comparison would be relatively simple, since
>>>> the URI is obtained dynamically when the dynamic grant is created.
>>>>
>>>> URI based grant's don't use DNS resolution and would have a narrower
>>>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>>>> URL based CodeSource and vice versa.  Rather than rely on DNS
>>>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>>>> names in policy files.  URL.toURI() can be utilised to check if URI
>>>> grant's imply a CodeSource without resorting to DNS.
>>>>
>>>> Any thoughts, comments or ideas?
>>>>
>>>> N.B. It's sad that security is implemented the way it is, it would be
>>>> far better if it was Executor based, since every protection domain could
>>>> be checked in parallel, rather than in sequence.
>>>>
>>>> Regards,
>>>>
>>>> Peter.
>>>>
>>>>
>>>>
>>
>


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Dan Creswell <da...@gmail.com>.
On 24 December 2011 11:04, Peter Firmstone <ji...@zeus.net.au> wrote:
> Dan Creswell wrote:
>>
>> So...
>>
>> On 23 December 2011 11:32, Peter Firmstone <ji...@zeus.net.au> wrote:
>>
>>>
>>> Hmmm, scratches beard, ok, you're right, up for some brainstorming?
>>>
>>>  1. If I reimplement SocketPermission, what sort of behaviour do we need?
>>>  2. Or a faster DNS provider? www.xbill.org/dnsjava -
>>>    *sun.net.spi.nameservice.provider.1=dns,dnsjava*
>>>
>>>
>>
>>
>> Well, I'm always going to lean towards fixing the root cause of the
>> problem which IMHO is DNS and its usage/performance in JDK. Which
>> means that a faster or at least smarter provider will be where I'd
>> want to go. JDK's default cache approach is kinda busted in any case.
>>
>
>
> A faster provider will help, and we can bundle dnsjava if we want,
> unfortunately though DNS lookup is oversubscribed in Sun's jdk, so we need
> to eliminate the sources of overuse also.

I don't think that holds necessarily if DNS lookup is good enough.

>
> I was able to successfully replace URL with URI in PreferredClassProvider
> and in CodeSource.implies(CodeSource cs) with URLGrant.implies(URI grant,
> URI implied), this avoids DNS lookup completely.
>
> With regard to a SocketPermissionCollection, if you have a number of domain
> name granted SocketPermission's, that aren't wildcards, each one will cause
> an unnecessary DNS lookup, until a suitable match is found.  Even domain
> names that are obviously different are resolved, in case they share an
> identical ip address.  Even identical domain addresses are resolved, to see
> if their IP addresses match.  If one has an IP address and the other a domain
> name, SocketPermission.implies will perform a dns lookup to get the missing
> IP address and then a  reverse dns lookup to obtain the other domain name,
> if the IP addresses don't match.
>
> A reimplementation of SocketPermission would allow the
> SocketPermissionCollection to take responsibility for DNS lookup after it
> checks every SocketPermission for a direct match without DNS: wildcards, IP
> address etc.  Make the domain name implies a separate check that's performed
> after all wildcards and IP addresses.
>
>
>
>>> A Comparator is good for ensuring the Permission object's are sorted into
>>> an
>>> efficient order before creating a PermissionCollection.  The Comparator
>>> isn't much good for a cache that contains a previously checked
>>> Permission,
>>> since equals will be executed (I don't currently cache SocketPermission
>>> for
>>> this reason).  Collection.contains(Permission p)?
>>>
>>
>>
>> Collection.contains will surely use equals()?
>>
>
>
> In externally sorted Collections, yes, but not in TreeSet, which uses its
> Comparator rather than equals(), so that's an option, thanks ;)  Pays to
> read the docs properly.
>
>
>>
>>>
>>> With a SecurityManager and our own PolicyFile implementation, it is
>>> possible
>>> to replace / substitute SocketPermission, from both ends, but both the
>>> policy and SM must be in place or it won't work.  PolicyFile must be
>>>
>>
>>
>> Smells like we're heading towards part of a standard platform. A
>> security manager and policy generally need to be available to support
>> downloadable code so I don't see this being an issue?
>>
>
>
> Yep, got that smell about it.
>
>>
>>>
>>> instantiated early, or we risk having static ProtectionDomain's that
>>> still
>>> contain java.net.SocketPermission.  ConcurrentPermissions, a replacement
>>> for
>>> Permissions that ProtectionDomain's use to hold static Permissions could
>>> also be used to convert any stray SocketPermission objects.
>>>
>>> One question I've asked myself when creating my own policy implementation
>>> was CodeSource.implies(CodeSource cs), the implementation seemed like a
>>> bad
>>> idea, it uses DNS, an attacker could use DNS cache poisoning to gain
>>> elevated permission using an untrusted CodeSource URL, simply because the
>>> policy thinks the CodeSource is implied.  I changed PolicyFile to
>>> specifically not use CodeSource.implies().  In reality a signer
>>> Certificate
>>> is required to identify the CodeSource with any level of trust.
>>>
>>>
>>
>>
>> Well, I think a more general point here would be that JDK's default
>> set of behaviours are designed to "protect" against DNS based attacks
>> (i.e. a successful lookup result is cached forever and so changes
>> can't leak in). This is bogus, because if the first lookup is
>> compromised you're dead and buried.
>>
>> The correct solution (and more practical these days) is to properly
>> secure your DNS.
>>
>> Which brings me to a general statement in respect of DNS security - do
>> it in that system, don't attempt to compensate in the application. Any
>> firm that generally cares about security will have done this
>> already....
>>
>
>
> Then there's the internet, where DNS can't be trusted.  The current root key
> certificate system in DNSSEC could also be compromised at some point in the
> future.  Certificate authorities are proving that they can't be trusted.
>

Then you run your own DNS, and detach it from external factors....

> The only system that appears to be resistant is OpenPGP's model, a web of
> trust, with certificate revocation.
>
>
>>
>>>
>>> Now an IPv4 address can be converted to IPv6, so IP addresses could be
>>> converted to IPv6 format and compared.  Host names could be compared
>>> using
>>> string comparison.  But without DNS an IP address couldn't equal a domain
>>> name and a domain name couldn't be resolved to imply an IP address.
>>>
>>> The intended purpose of SocketPermission is to check if a user and or
>>> code
>>> is allowed to connect, listen, etc to a network address.  How can we
>>> trust
>>> the DNS to give the right information?
>>>
>>
>>
>> See above, DNS can give you the wrong information because it's
>> mis-configured, this isn't just a security problem.
>>
>>
>>>
>>> How do we know the DNS has resolved a domain name correctly?
>>>
>>>
>>
>>
>> You don't unless it's secured and you're confident spoofing options
>> are eliminated (which means you'd need to be sure physical network
>> compromise for example is reasonably addressed).
>>
>>
>>>
>>> The most logical way to identify the remote end, would be via a
>>> connection
>>> that requires it to authenticate.  We have that now with secure JERI.
>>>
>>
>>
>> And secure JERI handshakes are horribly slow (to be fair the
>> underlying protocols produce that performance envelope).
>>
>
>
> Hmm, sounds like a future opportunity, I wonder if elliptic curve
> cryptography could help.
>
>
>>
>>>
>>> SocketPermission only makes a decision whether to allow a connection or
>>> not,
>>> that's it.
>>>
>>> Without DNS, the policy admin would have to enter SocketPermission grants
>>> for domain names and IP addresses (manual duplication), so it seems DNS
>>> is
>>> there for convenience.
>>>
>>>
>>
>>
>> DNS is, and always has been, a convenience - admins back in the day
>> wanted something that would provide the equivalent of /etc/hosts
>> across large numbers of machines for low effort.
>>
>>
>>>
>>> Using RemotePolicy for a djinn group, we could have an administrator
>>> node,
>>> resolve all current domain names (and reverse lookup IP addresses) in the
>>> djinn policy file and update all group member nodes with duplicated
>>> SocketPermission's for IP address and domain name forms.  Then none of
>>> the
>>> nodes would need to perform DNS resolution.  Again that requires our own
>>> SocketPermission implementation.
>>>
>>
>>
>> Meh, I can't believe that's more performant than having each box doing
>> direct DNS resolution for itself....
>>
>
>
> If you only need to update your RemotePolicy occasionally and you can
> resolve all addresses off line prior to performing a policy update, then DNS
> isn't required in the middle of a SecurityCheck in progress...

That's just a hosts file and DNS disabled....

And you'd rsync it around or similar.

You'd of course be messed around by JDK's default caching policy but...
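
For anyone following along, the caching policy in question is driven by
two security properties.  A minimal sketch of overriding them; the class
and method names are made up for illustration and the TTL values are
arbitrary.  As far as I know the properties have to be set before the
JVM performs its first lookup, and with a SecurityManager installed the
default for successful lookups is to cache forever:

```java
import java.security.Security;

// Hypothetical helper, not part of River or the JDK.
final class DnsCacheTtlSketch {

    /**
     * Sets the JDK's DNS cache lifetimes, in seconds.
     * "networkaddress.cache.ttl" covers successful lookups,
     * "networkaddress.cache.negative.ttl" covers failed ones.
     */
    static void configure(int positiveSeconds, int negativeSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeSeconds));
    }

    public static void main(String[] args) {
        // Call this early, before any InetAddress lookup takes place.
        configure(60, 5);
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```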

>
> Background pre-processed, rather than on demand, so to speak.
>
> RemotePolicy doesn't make policy decisions, it's a method of transferring
> Policy Permission grant's.  So while a policy update is in progress,
> security checks may continue unimpeded; the switch is made at the last
> moment, once all information has been transferred and cached locally.
>  Perhaps I should rename it?  DjinnGroupPolicy?  You limit the Permission's
> the Djinn Group administrator can grant using GrantPermission.  There is a
> one-to-one relationship between RemotePolicy and Group.
> RemotePolicy isn't completely implemented.
>
> If DNS is reduced to the absolute minimum, perhaps it could be preprocessed
> locally too.  Once we start performing security checks, we need to be
> decisive.

Still a hosts file....

>
> I'd welcome a helping hand ;)
>
> Merry Christmas & Cheers,
>
> Peter.
>
>>
>>>
>>> Cheers,
>>>
>>> Peter.
>>>
>>>
>>>
>>>
>>> Dan Creswell wrote:
>>>
>>>>
>>>> IMHO, workarounds like this are asking for trouble. You're assuming
>>>> certain rational actions (a decent toString) on behalf of some other
>>>> programmer in the presence of evidence that says they aren't rational
>>>> - i.e. they have a poor equals and/or a poor hashCode implementation.
>>>>
>>>> Combine that with putting the mechanism "under the covers" and I feel
>>>> that's a nasty piece of dark magic brewing that'll give us problems
>>>> later.
>>>>
>>>> An explicit workaround option is supported in typical collections via
>>>> a Comparator, yes, that means others have to write some code but it
>>>> also means the troubles they're facing with equals and hashCode are
>>>> "in your face".
>>>>
>>>> Happy Christmas,
>>>>
>>>> Dan.
>>>>
>>>> On 23 December 2011 02:06, Peter Firmstone <ji...@zeus.net.au> wrote:
>>>>
>>>>
>>>>>
>>>>> There's another way around poorly written equals() and hashCode()
>>>>> implementations.
>>>>>
>>>>> In my reference collection utilities, I have strong, weak and soft
>>>>> references, there are variations on these based on identity or
>>>>> equality.
>>>>>
>>>>> Well, I've just thought of another that might help out when poor equals
>>>>> implementations exist:
>>>>>
>>>>> toString()?
>>>>>
>>>>> first check both objects have the same class, then compare the results
>>>>> of
>>>>> toString(), and use toString().hashCode() for hashCode's.
>>>>>
>>>>> I could call this String equality, when toString isn't overridden it
>>>>> prints
>>>>> the reference address so this is compatible with identity based
>>>>> equality
>>>>> also.
>>>>>
>>>>> This would fix all those nasty equals implementations for use in
>>>>> collections
>>>>> without requiring any work on the developers part.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Peter.
>>>>>
>>>>>
>>>>>
>>>>> Peter wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> SocketPermissionCollection adds SocketPermission at the head of its
>>>>>> internal list.  This change was made in jdk 1.2.2_005  bug 4301064
>>>>>> related
>>>>>> to reverse dns lookup delays for applets.
>>>>>>
>>>>>> This indicates that the tail of the last policy file parsed, is added
>>>>>> last
>>>>>> to the policy and hence at the head of that List.
>>>>>>
>>>>>> It's  also worth noting that the standard policy provider included
>>>>>> with
>>>>>> the jvm is in force until the preferred policy provider is completely
>>>>>> initiated, after reading all policy files.  So it's likely that the
>>>>>> standard
>>>>>> java policy is read last by our policy provider implementation.
>>>>>>
>>>>>> In summary a list of SocketPermissions need to be sorted beginning
>>>>>> from
>>>>>> those that cause long dns delays, to wildcard based permissions, so
>>>>>> the
>>>>>> wildcard perms are added last and hence checked first by any implies
>>>>>> calls.
>>>>>>
>>>>>> I've got two options on how to solve this:
>>>>>>
>>>>>> 1.  Get rid of PermissionCollection based caches altogether and
>>>>>> generate
>>>>>> PermissionCollection's on demand.
>>>>>>
>>>>>> 2  Replace the PermissionCollection cache with a List<Permission>
>>>>>> based
>>>>>> cache, generate PermissionCollection's on demand.  Sort the List after
>>>>>> creation, before publishing, replace the list on write.
>>>>>>
>>>>>> Option 2 could be implemented in ConcurrentPermissions, a replacement
>>>>>> for
>>>>>> java.security.Permissions.
>>>>>>
>>>>>> Option 1 would be implemented by the policy.
>>>>>>
>>>>>> In addition, to allow the security manager to cache the results of
>>>>>> permission checks for SocketPermission, I can create a wrapper class,
>>>>>> where
>>>>>> equals and hashCode are based purely on the string representation.
>>>>>>  This
>>>>>> allows very rapid repeated permission checks.
>>>>>>
>>>>>> Looks like I can get around the SocketPermission, CodeSource and URL
>>>>>> headaches, relatively unscathed.
>>>>>> N.B. Anyone care to try out, or seriously performance test the new
>>>>>> PreferredClassProvider?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Peter.
>>>>>>
>>>>>> ----- Original message -----
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Actually, more significantly for me is that the default localhost
>>>>>>> SocketPermission is checked before a more lenient SocketPermission.
>>>>>>> In
>>>>>>> theory,
>>>>>>> one should be able to introspect SocketPermission instances and
>>>>>>> determine
>>>>>>> that
>>>>>>> one may be automatically implied by the other so can be skipped,
>>>>>>> possibly
>>>>>>> saving
>>>>>>> a lookup. Chris
>>>>>>>
>>>>>>> Peter Firmstone wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> A big problem with the current implementation is SocketPermission
>>>>>>>> blocks
>>>>>>>> other permission checks from proceeding.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>

Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter Firmstone <ji...@zeus.net.au>.
Dan Creswell wrote:
> So...
>
> On 23 December 2011 11:32, Peter Firmstone <ji...@zeus.net.au> wrote:
>   
>> Hmmm, scratches beard, ok, you're right, up for some brainstorming?
>>
>>  1. If I reimplement SocketPermission, what sort of behaviour do we need?
>>  2. Or a faster DNS provider? www.xbill.org/dnsjava -
>>     *sun.net.spi.nameservice.provider.1=dns,dnsjava*
>>
>>     
>
> Well, I'm always going to lean towards fixing the root cause of the
> problem which IMHO is DNS and its usage/performance in JDK. Which
> means that a faster or at least smarter provider will be where I'd
> want to go. JDK's default cache approach is kinda busted in any case.
>   

A faster provider will help, and we can bundle dnsjava if we want, 
unfortunately though DNS lookup is oversubscribed in Sun's jdk, so we 
need to eliminate the sources of overuse also.

I was able to successfully replace URL with URI in 
PreferredClassProvider, and to replace CodeSource.implies(CodeSource cs) 
with URLGrant.implies(URI grant, URI implied); this avoids DNS lookup 
completely.
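
To illustrate the idea (this is only a sketch, not the actual URLGrant 
code; the class name and the matching rules are my assumptions, loosely 
modelled on the "/-" and "/*" conventions used by CodeSource), a URI 
comparison that never touches DNS could look like this:

```java
import java.net.URI;

// Hypothetical illustration only; River's URLGrant may differ.
final class UriGrantSketch {

    /** True if grant covers candidate, using string comparison only. */
    static boolean implies(URI grant, URI candidate) {
        URI g = grant.normalize();
        URI c = candidate.normalize();
        if (g.getScheme() == null
                || !g.getScheme().equalsIgnoreCase(c.getScheme())) {
            return false;
        }
        // Hosts are compared as text: an IP literal never equals a name.
        String gh = g.getHost();
        String ch = c.getHost();
        if (gh == null ? ch != null : !gh.equalsIgnoreCase(ch)) {
            return false;
        }
        if (g.getPort() != c.getPort()) {
            return false;
        }
        String gp = g.getPath() == null ? "" : g.getPath();
        String cp = c.getPath() == null ? "" : c.getPath();
        if (gp.endsWith("/-")) {          // directory, recursively
            return cp.startsWith(gp.substring(0, gp.length() - 1));
        }
        if (gp.endsWith("/*")) {          // directory, direct children
            String dir = gp.substring(0, gp.length() - 1);
            return cp.startsWith(dir) && cp.indexOf('/', dir.length()) < 0;
        }
        return gp.equals(cp);
    }
}
```

Note the narrower scope: a grant for a domain name doesn't imply the 
same host written as an IP address, which is the trade-off for never 
consulting DNS.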

With regard to a SocketPermissionCollection, if you have a number of 
domain name granted SocketPermission's, that aren't wildcards, each one 
will cause an unnecessary DNS lookup, until a suitable match is found.  
Even domain names that are obviously different are resolved, in case 
they share an identical ip address.  Even identical domain addresses are 
resolved, to see if their IP addresses match.  If one has an IP address 
and the other a domain name, SocketPermission.implies will perform a DNS 
lookup to get the missing IP address and then a reverse DNS lookup to 
obtain the other domain name, if the IP addresses don't match.

A reimplementation of SocketPermission would allow the 
SocketPermissionCollection to take responsibility for DNS lookup after 
it checks every SocketPermission for a direct match without DNS: 
wildcards, IP address etc.  Make the domain name implies a separate 
check that's performed after all wildcards and IP addresses.
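
Roughly what I have in mind, as a sketch only.  Everything here is 
hypothetical: hosts are plain strings, ports and actions are left out, 
and the resolver function is a stand-in for DNS so the ordering can be 
exercised without a network:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of a two pass check; not the JDK's collection.
final class TwoPassHostMatcher {
    private final List<String> grants = new ArrayList<>();
    // Maps a host name or IP literal to its addresses (stand-in for DNS).
    private final Function<String, Set<String>> resolver;

    TwoPassHostMatcher(Function<String, Set<String>> resolver) {
        this.resolver = resolver;
    }

    void add(String hostPattern) {
        grants.add(hostPattern.toLowerCase(Locale.ROOT));
    }

    boolean implies(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        // Pass 1: wildcards and literal matches, no DNS at all.
        for (String g : grants) {
            if (g.equals("*") || g.equals(h)) return true;
            if (g.startsWith("*.") && h.endsWith(g.substring(1))) return true;
        }
        // Pass 2: only now resolve, and only the non wildcard grants.
        Set<String> target = null;
        for (String g : grants) {
            if (g.startsWith("*")) continue;
            if (target == null) target = resolver.apply(h);
            if (!Collections.disjoint(target, resolver.apply(g))) return true;
        }
        return false;
    }
}
```

With this ordering a slow or unresolvable grant sitting in front of a 
matching wildcard costs nothing, because the wildcard wins in pass 1.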


>> A Comparator is good for ensuring the Permission object's are sorted into an
>> efficient order before creating a PermissionCollection.  The Comparator
>> isn't much good for a cache that contains a previously checked Permission,
>> since equals will be executed (I don't currently cache SocketPermission for
>> this reason).  Collection.contains(Permission p)?
>>     
>
> Collection.contains will surely use equals()?
>   

In externally sorted Collections, yes, but not in TreeSet, which uses its 
Comparator rather than equals(), so that's an option, thanks ;)  Pays to 
read the docs properly.
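
A quick sketch of that option, using URL since its equals() has the 
same DNS problem.  The class name is made up; the point is only that 
TreeSet.contains() goes through the Comparator, so ordering by the 
string form means URL.equals() and hashCode() are never called:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical cache keyed on the textual form of each URL.
final class StringOrderedUrlCache {
    // The Comparator decides membership; URL.equals() is not consulted.
    private final Set<URL> seen =
            new TreeSet<>(Comparator.comparing(URL::toExternalForm));

    boolean add(String spec) {
        return seen.add(toUrl(spec));
    }

    boolean contains(String spec) {
        return seen.contains(toUrl(spec));
    }

    private static URL toUrl(String spec) {
        try {
            return URI.create(spec).toURL();   // parsing only, no lookup
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }
}
```

The set is not thread safe as written, so a real cache would need 
ConcurrentSkipListSet or external synchronization.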

>   
>> With a SecurityManager and our own PolicyFile implementation, it is possible
>> to replace / substitute SocketPermission, from both ends, but both the
>> policy and SM must be in place or it won't work.  PolicyFile must be
>>     
>
> Smells like we're heading towards part of a standard platform. A
> security manager and policy generally need to be available to support
> downloadable code so I don't see this being an issue?
>   

Yep, got that smell about it.
>   
>> instantiated early, or we risk having static ProtectionDomain's that still
>> contain java.net.SocketPermission.  ConcurrentPermissions, a replacement for
>> Permissions that ProtectionDomain's use to hold static Permissions could
>> also be used to convert any stray SocketPermission objects.
>>
>> One question I've asked myself when creating my own policy implementation
>> was CodeSource.implies(CodeSource cs), the implementation seemed like a bad
>> idea, it uses DNS, an attacker could use DNS cache poisoning to gain
>> elevated permission using an untrusted CodeSource URL, simply because the
>> policy thinks the CodeSource is implied.  I changed PolicyFile to
>> specifically not use CodeSource.implies().  In reality a signer Certificate
>> is required to identify the CodeSource with any level of trust.
>>
>>     
>
> Well, I think a more general point here would be that JDK's default
> set of behaviours are designed to "protect" against DNS based attacks
> (i.e. a successful lookup result is cached forever and so changes
> can't leak in). This is bogus, because if the first lookup is
> compromised you're dead and buried.
>
> The correct solution (and more practical these days) is to properly
> secure your DNS.
>
> Which brings me to a general statement in respect of DNS security - do
> it in that system, don't attempt to compensate in the application. Any
> firm that generally cares about security will have done this
> already....
>   

Then there's the internet, where DNS can't be trusted.  The current root 
key certificate system in DNSSEC could also be compromised at some 
point in the future.  Certificate authorities are proving that they 
can't be trusted.

The only system that appears to be resistant is OpenPGP's model, a web 
of trust, with certificate revocation.

>   
>> Now an IPv4 address can be converted to IPv6, so IP addresses could be
>> converted to IPv6 format and compared.  Host names could be compared using
>> string comparison.  But without DNS an IP address couldn't equal a domain
>> name and a domain name couldn't be resolved to imply an IP address.
>>
>> The intended purpose of SocketPermission is to check if a user and or code
>> is allowed to connect, listen, etc to a network address.  How can we trust
>> the DNS to give the right information?
>>     
>
> See above, DNS can give you the wrong information because it's
> mis-configured, this isn't just a security problem.
>
>   
>> How do we know the DNS has resolved a domain name correctly?
>>
>>     
>
> You don't unless it's secured and you're confident spoofing options
> are eliminated (which means you'd need to be sure physical network
> compromise for example is reasonably addressed).
>
>   
>> The most logical way to identify the remote end, would be via a connection
>> that requires it to authenticate.  We have that now with secure JERI.
>>     
>
> And secure JERI handshakes are horribly slow (to be fair the
> underlying protocols produce that performance envelope).
>   

Hmm, sounds like a future opportunity, I wonder if elliptic curve 
cryptography could help.

>   
>> SocketPermission only makes a decision whether to allow a connection or not,
>> that's it.
>>
>> Without DNS, the policy admin would have to enter SocketPermission grants
>> for domain names and IP addresses (manual duplication), so it seems DNS is
>> there for convenience.
>>
>>     
>
> DNS is, and always has been, a convenience - admins back in the day
> wanted something that would provide the equivalent of /etc/hosts
> across large numbers of machines for low effort.
>
>   
>> Using RemotePolicy for a djinn group, we could have an administrator node,
>> resolve all current domain names (and reverse lookup IP addresses) in the
>> djinn policy file and update all group member nodes with duplicated
>> SocketPermission's for IP address and domain name forms.  Then none of the
>> nodes would need to perform DNS resolution.  Again that requires our own
>> SocketPermission implementation.
>>     
>
> Meh, I can't believe that's more performant than having each box doing
> direct DNS resolution for itself....
>   

If you only need to update your RemotePolicy occasionally and you can
resolve all addresses offline prior to performing a policy update, then
DNS isn't required in the middle of a security check in progress...

Background pre-processed, rather than on demand, so to speak.

RemotePolicy doesn't make policy decisions, it's a method of
transferring Policy Permission grants.  So while a policy update is in
progress, security checks may continue unimpeded; the switch is made at
the last moment, once all information has been transferred and cached
locally.  Perhaps I should rename it?  DjinnGroupPolicy?  You limit the
Permissions the Djinn Group administrator can grant using
GrantPermission.  There is a one-to-one relationship between
RemotePolicy and Group, although RemotePolicy isn't completely
implemented.
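A minimal sketch of the "switch at the last moment" idea, assuming the
grants are held as an immutable list behind an AtomicReference.
SwappableGrants and its method names are hypothetical and are not the
actual RemotePolicy code:

```java
import java.security.Permission;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for the grants currently in force. */
final class SwappableGrants {
    // Readers always see one complete, immutable snapshot.
    private final AtomicReference<List<Permission>> current =
            new AtomicReference<>(Collections.<Permission>emptyList());

    /** Called once an update has been fully received and prepared locally. */
    void replace(List<? extends Permission> fresh) {
        List<Permission> snapshot =
                Collections.unmodifiableList(new ArrayList<Permission>(fresh));
        current.set(snapshot); // the switch is a single reference write
    }

    /** Lock-free check against whichever snapshot is current. */
    boolean implies(Permission requested) {
        for (Permission granted : current.get()) {
            if (granted.implies(requested)) {
                return true;
            }
        }
        return false;
    }
}
```

Checks that start before replace() returns simply finish against the old
snapshot, so no reader ever blocks on an update in progress.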

If DNS is reduced to the absolute minimum, perhaps it could be 
preprocessed locally too.  Once we start performing security checks, we 
need to be decisive.

I'd welcome a helping hand ;)

Merry Christmas & Cheers,

Peter.
>   
>> Cheers,
>>
>> Peter.
>>
>>
>>
>>
>> Dan Creswell wrote:
>>     
>>> IMHO, workarounds like this are asking for trouble. You're assuming
>>> certain rational actions (a decent toString) on behalf of some other
>>> programmer in the presence of evidence that says they aren't rational
>>> - i.e. they have a poor equals and/or a poor hashCode implementation.
>>>
>>> Combine that with putting the mechanism "under the covers" and I feel
>>> that's a nasty piece of dark magic brewing that'll give us problems
>>> later.
>>>
>>> An explicit workaround option is supported in typical collections via
>>> a Comparator, yes, that means others have to write some code but it
>>> also means the troubles they're facing with equals and hashCode are
>>> "in your face".
>>>
>>> Happy Christmas,
>>>
>>> Dan.
>>>
>>> On 23 December 2011 02:06, Peter Firmstone <ji...@zeus.net.au> wrote:
>>>
>>>       
>>>> There's another way around poorly written equals() and hashCode()
>>>> implementations.
>>>>
>>>> In my reference collection utilities, I have strong, weak and soft
>>>> references, there are variations on these based on identity or equality.
>>>>
>>>> Well, I've just thought of another that might help out when poor equals
>>>> implementations exist:
>>>>
>>>> toString()?
>>>>
>>>> first check both objects have the same class, then compare the results of
>>>> toString(), and use toString().hashCode() for hashCode's.
>>>>
>>>> I could call this String equality, when toString isn't overridden it
>>>> prints
>>>> the reference address so this is compatible with identity based equality
>>>> also.
>>>>
>>>> This would fix all those nasty equals implementations for use in
>>>> collections
>>>> without requiring any work on the developers part.
>>>>
>>>> Cheers,
>>>>
>>>> Peter.
>>>>
>>>>
>>>>
>>>> Peter wrote:
>>>>
>>>>         
>>>>> SocketPermissionCollection adds SocketPermission at the head of its
>>>>> internal list.  This change was made in jdk 1.2.2_005  bug 4301064
>>>>> related
>>>>> to reverse dns lookup delays for applets.
>>>>>
>>>>> This indicates that the tail of the last policy file parsed, is added
>>>>> last
>>>>> to the policy and hence at the head of that List.
>>>>>
>>>>> It's  also worth noting that the standard policy provider included with
>>>>> the jvm is in force until the preferred policy provider is completely
>>>>> initiated, after reading all policy files.  So it's likely that the
>>>>> standard
>>>>> java policy is read last by our policy provider implementation.
>>>>>
>>>>> In summary a list of SocketPermissions need to be sorted beginning from
>>>>> those that cause long dns delays, to wildcard based permissions, so the
>>>>> wildcard perms are added last and hence checked first by any implies
>>>>> calls.
>>>>>
>>>>> I've got two options on how to solve this:
>>>>>
>>>>> 1.  Get rid of PermissionCollection based caches altogether and generate
>>>>> PermissionCollection's on demand.
>>>>>
>>>>> 2  Replace the PermissionCollection cache with a List<Permission> based
>>>>> cache, generate Permissioncollection's on demand.  Sort the List after
>>>>> creation, before publishing, replace the list on write.
>>>>>
>>>>> Option 2 could be implemented in ConcurrentPermissions, a replacement
>>>>> for
>>>>> java.security.Permissions.
>>>>>
>>>>> Option 1 would be implemented by the policy.
>>>>>
>>>>> In addition, to allow the security manager to cache the results of
>>>>> permission checks for SocketPermission, I can create a wrapper class,
>>>>> where
>>>>> equals and hashcode are based purely on the string representation.  This
>>>>> allows very rapid repeated permission checks.
>>>>>
>>>>> Looks like I can get around the SocketPermission, CodeSource and URL
>>>>> headaches, relatively unscathed.
>>>>> N.B. Anyone care to try out, or seriously performance test the new
>>>>> PreferredClassProvider?
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Peter.
>>>>>
>>>>> ----- Original message -----
>>>>>
>>>>>
>>>>>           
>>>>>> Actually, more significantly for me is that the default localhost
>>>>>> SocketPermission is checked before a more lenient SocketPermission. In
>>>>>> theory,
>>>>>> one should be able to introspect SocketPermission instances and
>>>>>> determine
>>>>>> that
>>>>>> one may be automatically implied by the other so can be skipped,
>>>>>> possibly
>>>>>> saving
>>>>>> a lookup. Chris
>>>>>>
>>>>>> Peter Firmstone wrote:
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> A big problem with the current implementation is SocketPermission
>>>>>>> blocks
>>>>>>> other permission checks from proceeding.
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>
>>>>>
>>>>>           
>>>>         
>>>
>>>       
>>     
>
>   


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter <ji...@zeus.net.au>.
----- Original message -----
> On 12/24/2011 2:14 AM, Dan Creswell wrote:
> > So...
> >
> > On 23 December 2011 11:32, Peter Firmstone<ji...@zeus.net.au>  wrote:
> > One question I've asked myself when creating my own policy implementation
> > was CodeSource.implies(CodeSource cs), the implementation seemed like a bad
> > idea, it uses DNS, an attacker could use DNS cache poisoning to gain
> > elevated permission using an untrusted CodeSource URL, simply because the
> > policy thinks the CodeSource is implied.  I changed PolicyFile to
> > specifically not use CodeSource.implies().  In reality a signer Certificate
> > is required to identify the CodeSource with any level of trust.
> >
> > Well, I think a more general point here would be that JDK's default
> > set of behaviours are designed to "protect" against DNS based attacks
> > (i.e. a successful lookup result is cached forever and so changes
> > can't leak in). This is bogus, because if the first lookup is
> > compromised you're dead and buried.
> I think it's fundamental to understand that a lot of the DNS caching behavior
> was born in the Applet world.  When Java first hit the scenes, we had the
> problem that people could demonstrate that they could "know" the address of the
> socket on the remote end, and thus could use that (this was before NAT was in
> use, or at least wide spread), and poison the DNS so that subsequent lookups
> returned addresses on the local network, instead of the correct address of the
> original server.
>
> That's one path of exploitation, but as Dan says, there are others in the Jini
> world where the first lookup, being poisoned, can cause exploitative code to be
> downloaded.
>
> I think that it's vital to understand, that whether you cache the first, second
> or fifth lookup, each situation presents a different set  of challenges in
> providing security.  Ultimately, Jini needs, in my opinion, to focus
> authentication above the network layer, and use signed jars, encrypted paths,
> and cert based auth, so that the network path, can not be a part of the
> exploitation, and instead, each end of a "communication", is responsible for
> trusting the other, through negotiations carried through the network, instead of
> using information about the network to guarantee trust.
>
> Gregg Wonderly

+1 Well said, my thoughts exactly, grant Permission to Certificate & Principal combinations.

We might need to work towards a PGP trust model.

Cheers & Merry Christmas,

Pete.

Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Gregg Wonderly <gr...@wonderly.org>.
On 12/24/2011 2:14 AM, Dan Creswell wrote:
> So...
>
> On 23 December 2011 11:32, Peter Firmstone<ji...@zeus.net.au>  wrote:
> One question I've asked myself when creating my own policy implementation
> was CodeSource.implies(CodeSource cs), the implementation seemed like a bad
> idea, it uses DNS, an attacker could use DNS cache poisoning to gain
> elevated permission using an untrusted CodeSource URL, simply because the
> policy thinks the CodeSource is implied.  I changed PolicyFile to
> specifically not use CodeSource.implies().  In reality a signer Certificate
> is required to identify the CodeSource with any level of trust.
>
> Well, I think a more general point here would be that JDK's default
> set of behaviours are designed to "protect" against DNS based attacks
> (i.e. a successful lookup result is cached forever and so changes
> can't leak in). This is bogus, because if the first lookup is
> compromised you're dead and buried.
I think it's fundamental to understand that a lot of the DNS caching behavior
was born in the Applet world.  When Java first hit the scene, we had the
problem that people could demonstrate that they could "know" the address of the
socket on the remote end, and thus could use that (this was before NAT was in
use, or at least widespread), and poison the DNS so that subsequent lookups
returned addresses on the local network, instead of the correct address of the
original server.

That's one path of exploitation, but as Dan says, there are others in the Jini 
world where the first lookup, being poisoned, can cause exploitative code to be 
downloaded.

I think that it's vital to understand that whether you cache the first, second
or fifth lookup, each situation presents a different set of challenges in
providing security.  Ultimately, Jini needs, in my opinion, to focus
authentication above the network layer, and use signed jars, encrypted paths,
and cert based auth, so that the network path cannot be a part of the
exploitation, and instead, each end of a "communication" is responsible for
trusting the other, through negotiations carried through the network, instead of
using information about the network to guarantee trust.

Gregg Wonderly

Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Dan Creswell <da...@gmail.com>.
So...

On 23 December 2011 11:32, Peter Firmstone <ji...@zeus.net.au> wrote:
> Hmmm, scratches beard, ok, you're right, up for some brainstorming?
>
>  1. If I reimplement SocketPermission, what sort of behaviour do we need?
>  2. Or a faster DNS provider? www.xbill.org/dnsjava -
>     *sun.net.spi.nameservice.provider.1=dns,dnsjava*
>

Well, I'm always going to lean towards fixing the root cause of the
problem which IMHO is DNS and its usage/performance in JDK. Which
means that a faster or at least smarter provider will be where I'd
want to go. JDK's default cache approach is kinda busted in any case.

> A Comparator is good for ensuring the Permission object's are sorted into an
> efficient order before creating a PermissionCollection.  The Comparator
> isn't much good for a cache that contains a previously checked Permission,
> since equals will be executed (I don't currently cache SocketPermission for
> this reason).  Collection.contains(Permission p)?

Collection.contains will surely use equals()?

>
> With a SecurityManager and our own PolicyFile implementation, it is possible
> to replace / substitute SocketPermission, from both ends, but both the
> policy and SM must be in place or it won't work.  PolicyFile must be

Smells like we're heading towards part of a standard platform. A
security manager and policy generally need to be available to support
downloadable code so I don't see this being an issue?

> instantiated early, or we risk having static ProtectionDomain's that still
> contain java.net.SocketPermission.  ConcurrentPermissions, a replacement for
> Permissions that ProtectionDomain's use to hold static Permissions could
> also be used to convert any stray SocketPermission objects.
>
> One question I've asked myself when creating my own policy implementation
> was CodeSource.implies(CodeSource cs), the implementation seemed like a bad
> idea, it uses DNS, an attacker could use DNS cache poisoning to gain
> elevated permission using an untrusted CodeSource URL, simply because the
> policy thinks the CodeSource is implied.  I changed PolicyFile to
> specifically not use CodeSource.implies().  In reality a signer Certificate
> is required to identify the CodeSource with any level of trust.
>

Well, I think a more general point here would be that the JDK's default
set of behaviours is designed to "protect" against DNS based attacks
(i.e. a successful lookup result is cached forever and so changes
can't leak in). This is bogus, because if the first lookup is
compromised you're dead and buried.
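For reference, the cache lifetime mentioned here is governed by the
networkaddress.cache.ttl and networkaddress.cache.negative.ttl security
properties, where a value of -1 means "cache forever". A small sketch of
setting them programmatically; DnsCachePolicy is a hypothetical helper
name, and the values passed in are a deployment choice, not a
recommendation:

```java
import java.security.Security;

/** Hypothetical helper for tuning InetAddress caching. */
final class DnsCachePolicy {
    /**
     * Sets the JVM-wide TTLs (in seconds) for the InetAddress lookup cache.
     * This should run before the first name lookup; later changes may be
     * ignored by the running JVM.
     */
    static void configure(int positiveTtlSeconds, int negativeTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }
}
```

A finite TTL lets corrected records leak in, but as noted above it does
nothing about a poisoned first answer.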

The correct solution (and more practical these days) is to properly
secure your DNS.

Which brings me to a general statement in respect of DNS security - do
it in that system, don't attempt to compensate in the application. Any
firm that generally cares about security will have done this
already....

> Now a IPv4 address can be converted to an IPv6, so IP addresses could be
> converted to IPv6 format and compared.  Host names could be compared using
> string comparison.  But without DNS an IP address couldn't equal a domain
> name and a domain name couldn't be resolved to imply an IP address.
>
> The intended purpose of SocketPermission is to check if a user and or code
> is allowed to connect, listen, etc to a network address.  How can we trust
> the DNS to give the right information?

See above: DNS can give you the wrong information because it's
misconfigured; this isn't just a security problem.

>
> How do we know the DNS has resolved a domain name correctly?
>

You don't unless it's secured and you're confident spoofing options
are eliminated (which means you'd need to be sure physical network
compromise for example is reasonably addressed).

> The most logical way to identify the remote end, would be via a connection
> that requires it to authenticate.  We have that now with secure JERI.

And secure JERI handshakes are horribly slow (to be fair the
underlying protocols produce that performance envelope).

>
> SocketPermission only makes a decision whether to allow a connection or not,
> that's it.
>
> Without DNS, the policy admin would have to enter SocketPermission grants
> for domain names and IP addresses (manual duplication), so it seems DNS is
> there for convenience.
>

DNS is, and always has been, a convenience - admins back in the day
wanted something that would provide the equivalent of /etc/hosts
across large numbers of machines for low effort.

> Using RemotePolicy for a djinn group, we could have an administrator node,
> resolve all current domain names (and reverse lookup IP addresses) in the
> djinn policy file and update all group member nodes with duplicated
> SocketPermission's for IP address and domain name forms.  Then none of the
> nodes would need to perform DNS resolution.  Again that requires our own
> SocketPermission implementation.

Meh, I can't believe that's more performant than having each box doing
direct DNS resolution for itself....

>
> Cheers,
>
> Peter.
>
>
>
>
> Dan Creswell wrote:
>>
>> IMHO, workarounds like this are asking for trouble. You're assuming
>> certain rational actions (a decent toString) on behalf of some other
>> programmer in the presence of evidence that says they aren't rational
>> - i.e. they have a poor equals and/or a poor hashCode implementation.
>>
>> Combine that with putting the mechanism "under the covers" and I feel
>> that's a nasty piece of dark magic brewing that'll give us problems
>> later.
>>
>> An explicit workaround option is supported in typical collections via
>> a Comparator, yes, that means others have to write some code but it
>> also means the troubles they're facing with equals and hashCode are
>> "in your face".
>>
>> Happy Christmas,
>>
>> Dan.
>>
>> On 23 December 2011 02:06, Peter Firmstone <ji...@zeus.net.au> wrote:
>>
>>>
>>> There's another way around poorly written equals() and hashCode()
>>> implementations.
>>>
>>> In my reference collection utilities, I have strong, weak and soft
>>> references, there are variations on these based on identity or equality.
>>>
>>> Well, I've just thought of another that might help out when poor equals
>>> implementations exist:
>>>
>>> toString()?
>>>
>>> first check both objects have the same class, then compare the results of
>>> toString(), and use toString().hashCode() for hashCode's.
>>>
>>> I could call this String equality, when toString isn't overridden it
>>> prints
>>> the reference address so this is compatible with identity based equality
>>> also.
>>>
>>> This would fix all those nasty equals implementations for use in
>>> collections
>>> without requiring any work on the developers part.
>>>
>>> Cheers,
>>>
>>> Peter.
>>>
>>>
>>>
>>> Peter wrote:
>>>
>>>>
>>>> SocketPermissionCollection adds SocketPermission at the head of its
>>>> internal list.  This change was made in jdk 1.2.2_005  bug 4301064
>>>> related
>>>> to reverse dns lookup delays for applets.
>>>>
>>>> This indicates that the tail of the last policy file parsed, is added
>>>> last
>>>> to the policy and hence at the head of that List.
>>>>
>>>> It's  also worth noting that the standard policy provider included with
>>>> the jvm is in force until the preferred policy provider is completely
>>>> initiated, after reading all policy files.  So it's likely that the
>>>> standard
>>>> java policy is read last by our policy provider implementation.
>>>>
>>>> In summary a list of SocketPermissions need to be sorted beginning from
>>>> those that cause long dns delays, to wildcard based permissions, so the
>>>> wildcard perms are added last and hence checked first by any implies
>>>> calls.
>>>>
>>>> I've got two options on how to solve this:
>>>>
>>>> 1.  Get rid of PermissionCollection based caches altogether and generate
>>>> PermissionCollection's on demand.
>>>>
>>>> 2  Replace the PermissionCollection cache with a List<Permission> based
>>>> cache, generate Permissioncollection's on demand.  Sort the List after
>>>> creation, before publishing, replace the list on write.
>>>>
>>>> Option 2 could be implemented in ConcurrentPermissions, a replacement
>>>> for
>>>> java.security.Permissions.
>>>>
>>>> Option 1 would be implemented by the policy.
>>>>
>>>> In addition, to allow the security manager to cache the results of
>>>> permission checks for SocketPermission, I can create a wrapper class,
>>>> where
>>>> equals and hashcode are based purely on the string representation.  This
>>>> allows very rapid repeated permission checks.
>>>>
>>>> Looks like I can get around the SocketPermission, CodeSource and URL
>>>> headaches, relatively unscathed.
>>>> N.B. Anyone care to try out, or seriously performance test the new
>>>> PreferredClassProvider?
>>>>
>>>> Cheers,
>>>>
>>>> Peter.
>>>>
>>>> ----- Original message -----
>>>>
>>>>
>>>>>
>>>>> Actually, more significantly for me is that the default localhost
>>>>> SocketPermission is checked before a more lenient SocketPermission. In
>>>>> theory,
>>>>> one should be able to introspect SocketPermission instances and
>>>>> determine
>>>>> that
>>>>> one may be automatically implied by the other so can be skipped,
>>>>> possibly
>>>>> saving
>>>>> a lookup. Chris
>>>>>
>>>>> Peter Firmstone wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> A big problem with the current implementation is SocketPermission
>>>>>> blocks
>>>>>> other permission checks from proceeding.
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>

Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter Firmstone <ji...@zeus.net.au>.
Hmmm, scratches beard, ok, you're right, up for some brainstorming?

   1. If I reimplement SocketPermission, what sort of behaviour do we need?
   2. Or a faster DNS provider? www.xbill.org/dnsjava -
      *sun.net.spi.nameservice.provider.1=dns,dnsjava*

A Comparator is good for ensuring the Permission objects are sorted
into an efficient order before creating a PermissionCollection.  The
Comparator isn't much good for a cache that contains a previously
checked Permission, since equals will be executed (I don't currently
cache SocketPermission for this reason).  Collection.contains(Permission
p)?
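A rough sketch of the sorting step, assuming permissions are ranked by
their target name alone so that nothing gets resolved while sorting. The
class name, the three-level ranking and the IPv4-only pattern are
illustrative assumptions, not the policy provider's actual code:

```java
import java.security.Permission;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical ordering for SocketPermission-style target names. */
final class SocketPermissionOrder {
    /** Rank by name only: 0 = host name, 1 = IPv4 literal, 2 = wildcard. */
    static int rank(Permission p) {
        String name = p.getName();
        int colon = name.lastIndexOf(':'); // strip ":port"; IPv6 literals not handled
        String host = colon < 0 ? name : name.substring(0, colon);
        if (host.startsWith("*")) {
            return 2;
        }
        if (host.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
            return 1;
        }
        return 0;
    }

    /** Names likely to need DNS sort first, wildcards sort last. */
    static final Comparator<Permission> DNS_HEAVY_FIRST =
            Comparator.comparingInt(SocketPermissionOrder::rank);

    /** Returns a sorted copy; the input list is left untouched. */
    static List<Permission> sorted(List<? extends Permission> perms) {
        List<Permission> copy = new ArrayList<>(perms);
        copy.sort(DNS_HEAVY_FIRST);
        return copy;
    }
}
```

Adding the sorted list to a collection that inserts at its head would
then leave the wildcard entries to be checked first.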

With a SecurityManager and our own PolicyFile implementation, it is
possible to replace / substitute SocketPermission, from both ends, but
both the policy and SM must be in place or it won't work.  PolicyFile
must be instantiated early, or we risk having static ProtectionDomains
that still contain java.net.SocketPermission.  ConcurrentPermissions, a
replacement for the Permissions that ProtectionDomains use to hold
static Permissions, could also be used to convert any stray
SocketPermission objects.

One thing I've questioned when creating my own policy implementation
is CodeSource.implies(CodeSource cs).  The implementation seemed like a
bad idea: it uses DNS, so an attacker could use DNS cache poisoning to
gain elevated permission using an untrusted CodeSource URL, simply
because the policy thinks the CodeSource is implied.  I changed
PolicyFile to specifically not use CodeSource.implies().  In reality a
signer Certificate is required to identify the CodeSource with any level
of trust.

Now an IPv4 address can be converted to an IPv6 address, so IP addresses
could be converted to IPv6 format and compared.  Host names could be
compared using string comparison.  But without DNS an IP address couldn't
equal a domain name and a domain name couldn't be resolved to imply an IP
address.
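A minimal sketch of that DNS-free comparison, assuming dotted-quad IPv4
literals are widened to the IPv4-mapped IPv6 form (::ffff:a.b.c.d) and
everything else is treated as a host name. AddressText and its methods
are hypothetical, and parsing of IPv6 literals is left out for brevity:

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical text-only comparison of network targets. */
final class AddressText {
    /** Returns the 16-byte IPv4-mapped IPv6 form, or null if s is not a dotted quad. */
    static byte[] mapIpv4(String s) {
        String[] parts = s.split("\\.", -1);
        if (parts.length != 4) {
            return null;
        }
        byte[] out = new byte[16];
        out[10] = (byte) 0xff;
        out[11] = (byte) 0xff;
        for (int i = 0; i < 4; i++) {
            if (!parts[i].matches("\\d{1,3}")) {
                return null;
            }
            int octet = Integer.parseInt(parts[i]);
            if (octet > 255) {
                return null;
            }
            out[12 + i] = (byte) octet;
        }
        return out;
    }

    /** Compares two targets without any lookup. */
    static boolean sameTarget(String a, String b) {
        byte[] x = mapIpv4(a);
        byte[] y = mapIpv4(b);
        if (x != null && y != null) {
            return Arrays.equals(x, y);
        }
        if (x != null || y != null) {
            return false; // without DNS an address never equals a name
        }
        return a.toLowerCase(Locale.ROOT).equals(b.toLowerCase(Locale.ROOT));
    }
}
```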

The intended purpose of SocketPermission is to check if a user and/or
code is allowed to connect, listen, etc. to a network address.  How can
we trust the DNS to give the right information?

How do we know the DNS has resolved a domain name correctly?

The most logical way to identify the remote end would be via a
connection that requires it to authenticate.  We have that now with
secure JERI.

SocketPermission only makes a decision whether to allow a connection or 
not, that's it.

Without DNS, the policy admin would have to enter SocketPermission 
grants for domain names and IP addresses (manual duplication), so it 
seems DNS is there for convenience.

Using RemotePolicy for a djinn group, we could have an administrator
node resolve all current domain names (and reverse lookup IP addresses)
in the djinn policy file and update all group member nodes with
duplicated SocketPermissions for IP address and domain name forms.
Then none of the nodes would need to perform DNS resolution.  Again that
requires our own SocketPermission implementation.
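A sketch of the duplication step on the administrator node, assuming the
lookup is supplied as a plain function so it can run in the background.
GrantExpander and the resolver parameter are hypothetical and only stand
in for whatever lookup mechanism is really used:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper that duplicates a grant into name and address forms. */
final class GrantExpander {
    /**
     * Expands one "host:port" grant into its host name form plus one entry
     * per resolved address. The resolver is called here, ahead of time,
     * and never during a permission check.
     */
    static List<String> expand(String host, String portRange,
                               Function<String, List<String>> resolver) {
        List<String> targets = new ArrayList<>();
        targets.add(host + ":" + portRange);
        for (String address : resolver.apply(host)) {
            targets.add(address + ":" + portRange);
        }
        return targets;
    }
}
```

Each returned string would become the target name of one grant pushed
out to the group members.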

Cheers,

Peter.



Dan Creswell wrote:
> IMHO, workarounds like this are asking for trouble. You're assuming
> certain rational actions (a decent toString) on behalf of some other
> programmer in the presence of evidence that says they aren't rational
> - i.e. they have a poor equals and/or a poor hashCode implementation.
>
> Combine that with putting the mechanism "under the covers" and I feel
> that's a nasty piece of dark magic brewing that'll give us problems
> later.
>
> An explicit workaround option is supported in typical collections via
> a Comparator, yes, that means others have to write some code but it
> also means the troubles they're facing with equals and hashCode are
> "in your face".
>
> Happy Christmas,
>
> Dan.
>
> On 23 December 2011 02:06, Peter Firmstone <ji...@zeus.net.au> wrote:
>   
>> There's another way around poorly written equals() and hashCode()
>> implementations.
>>
>> In my reference collection utilities, I have strong, weak and soft
>> references, there are variations on these based on identity or equality.
>>
>> Well, I've just thought of another that might help out when poor equals
>> implementations exist:
>>
>> toString()?
>>
>> first check both objects have the same class, then compare the results of
>> toString(), and use toString().hashCode() for hashCode's.
>>
>> I could call this String equality, when toString isn't overridden it prints
>> the reference address so this is compatible with identity based equality
>> also.
>>
>> This would fix all those nasty equals implementations for use in collections
>> without requiring any work on the developers part.
>>
>> Cheers,
>>
>> Peter.
>>
>>
>>
>> Peter wrote:
>>     
>>> SocketPermissionCollection adds SocketPermission at the head of its
>>> internal list.  This change was made in jdk 1.2.2_005  bug 4301064 related
>>> to reverse dns lookup delays for applets.
>>>
>>> This indicates that the tail of the last policy file parsed, is added last
>>> to the policy and hence at the head of that List.
>>>
>>> It's  also worth noting that the standard policy provider included with
>>> the jvm is in force until the preferred policy provider is completely
>>> initiated, after reading all policy files.  So it's likely that the standard
>>> java policy is read last by our policy provider implementation.
>>>
>>> In summary a list of SocketPermissions need to be sorted beginning from
>>> those that cause long dns delays, to wildcard based permissions, so the
>>> wildcard perms are added last and hence checked first by any implies calls.
>>>
>>> I've got two options on how to solve this:
>>>
>>> 1.  Get rid of PermissionCollection based caches altogether and generate
>>> PermissionCollection's on demand.
>>>
>>> 2  Replace the PermissionCollection cache with a List<Permission> based
>>> cache, generate Permissioncollection's on demand.  Sort the List after
>>> creation, before publishing, replace the list on write.
>>>
>>> Option 2 could be implemented in ConcurrentPermissions, a replacement for
>>> java.security.Permissions.
>>>
>>> Option 1 would be implemented by the policy.
>>>
>>> In addition, to allow the security manager to cache the results of
>>> permission checks for SocketPermission, I can create a wrapper class, where
>>> equals and hashcode are based purely on the string representation.  This
>>> allows very rapid repeated permission checks.
>>>
>>> Looks like I can get around the SocketPermission, CodeSource and URL
>>> headaches, relatively unscathed.
>>> N.B. Anyone care to try out, or seriously performance test the new
>>> PreferredClassProvider?
>>>
>>> Cheers,
>>>
>>> Peter.
>>>
>>> ----- Original message -----
>>>
>>>       
>>>> Actually, more significantly for me is that the default localhost
>>>> SocketPermission is checked before a more lenient SocketPermission. In
>>>> theory,
>>>> one should be able to introspect SocketPermission instances and determine
>>>> that
>>>> one may be automatically implied by the other so can be skipped, possibly
>>>> saving
>>>> a lookup. Chris
>>>>
>>>> Peter Firmstone wrote:
>>>>
>>>>         
>>>>> A big problem with the current implementation is SocketPermission blocks
>>>>> other permission checks from proceeding.
>>>>>
>>>>>           
>>>
>>>
>>>       
>>     
>
>   


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Dan Creswell <da...@gmail.com>.
IMHO, workarounds like this are asking for trouble. You're assuming
certain rational actions (a decent toString) on behalf of some other
programmer in the presence of evidence that says they aren't rational
- i.e. they have a poor equals and/or a poor hashCode implementation.

Combine that with putting the mechanism "under the covers" and I feel
that's a nasty piece of dark magic brewing that'll give us problems
later.

An explicit workaround option is supported in typical collections via
a Comparator, yes, that means others have to write some code but it
also means the troubles they're facing with equals and hashCode are
"in your face".

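To illustrate the explicit route, here is a sketch using java.net.URL as
the example of a class whose equals() and hashCode() may resolve host
names. The ExplicitComparatorDemo class and its helpers are hypothetical
names; TreeSet and Comparator are the standard collections API:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical demo of a collection driven by an explicit Comparator. */
final class ExplicitComparatorDemo {
    /** Orders URLs by their textual form only. */
    static final Comparator<URL> BY_TEXT =
            Comparator.comparing(URL::toExternalForm);

    /** Helper so callers need not handle MalformedURLException. */
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    /** A TreeSet consults the Comparator, not URL.equals() or URL.hashCode(). */
    static Set<URL> newUrlSet() {
        return new TreeSet<>(BY_TEXT);
    }
}
```

The trade-off is visible at the call site: whoever creates the set has
to choose the ordering, which keeps the equals problem "in your face".
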
Happy Christmas,

Dan.

On 23 December 2011 02:06, Peter Firmstone <ji...@zeus.net.au> wrote:
> There's another way around poorly written equals() and hashCode()
> implementations.
>
> In my reference collection utilities, I have strong, weak and soft
> references, there are variations on these based on identity or equality.
>
> Well, I've just thought of another that might help out when poor equals
> implementations exist:
>
> toString()?
>
> first check both objects have the same class, then compare the results of
> toString(), and use toString().hashCode() for hashCode's.
>
> I could call this String equality, when toString isn't overridden it prints
> the reference address so this is compatible with identity based equality
> also.
>
> This would fix all those nasty equals implementations for use in collections
> without requiring any work on the developers part.
>
> Cheers,
>
> Peter.
>
>
>
> Peter wrote:
>>
>> SocketPermissionCollection adds SocketPermission at the head of its
>> internal list.  This change was made in jdk 1.2.2_005  bug 4301064 related
>> to reverse dns lookup delays for applets.
>>
>> This indicates that the tail of the last policy file parsed, is added last
>> to the policy and hence at the head of that List.
>>
>> It's  also worth noting that the standard policy provider included with
>> the jvm is in force until the preferred policy provider is completely
>> initiated, after reading all policy files.  So it's likely that the standard
>> java policy is read last by our policy provider implementation.
>>
>> In summary a list of SocketPermissions need to be sorted beginning from
>> those that cause long dns delays, to wildcard based permissions, so the
>> wildcard perms are added last and hence checked first by any implies calls.
>>
>> I've got two options on how to solve this:
>>
>> 1.  Get rid of PermissionCollection based caches altogether and generate
>> PermissionCollection's on demand.
>>
>> 2  Replace the PermissionCollection cache with a List<Permission> based
>> cache, generate Permissioncollection's on demand.  Sort the List after
>> creation, before publishing, replace the list on write.
>>
>> Option 2 could be implemented in ConcurrentPermissions, a replacement for
>> java.security.Permissions.
>>
>> Option 1 would be implemented by the policy.
>>
>> In addition, to allow the security manager to cache the results of
>> permission checks for SocketPermission, I can create a wrapper class, where
>> equals and hashcode are based purely on the string representation.  This
>> allows very rapid repeated permission checks.
>>
>> Looks like I can get around the SocketPermission, CodeSource and URL
>> headaches, relatively unscathed.
>> N.B. Anyone care to try out, or seriously performance test the new
>> PreferredClassProvider?
>>
>> Cheers,
>>
>> Peter.
>>
>> ----- Original message -----
>>
>>>
>>> Actually, more significantly for me is that the default localhost
>>> SocketPermission is checked before a more lenient SocketPermission. In
>>> theory,
>>> one should be able to introspect SocketPermission instances and determine
>>> that
>>> one may be automatically implied by the other so can be skipped, possibly
>>> saving
>>> a lookup. Chris
>>>
>>> Peter Firmstone wrote:
>>>
>>>>
>>>> A big problem with the current implementation is SocketPermission blocks
>>>> other permission checks from proceeding.
>>>>

Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter Firmstone <ji...@zeus.net.au>.
There's another way around poorly written equals() and hashCode() 
implementations.

In my reference collection utilities, I have strong, weak and soft 
references; there are variations on these based on identity or equality.

Well, I've just thought of another that might help out when poor equals 
implementations exist:

toString()?

First check that both objects have the same class, then compare the results 
of toString(), and use toString().hashCode() for hashCode().

I could call this String equality.  When toString() isn't overridden it 
prints the class name and hash code (the identity hash code, unless 
hashCode() is overridden), so this is broadly compatible with identity 
based equality also.

This would fix all those nasty equals implementations for use in 
collections without requiring any work on the developer's part.
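
A minimal sketch of this idea in Java, assuming a hypothetical wrapper class named StringEquality (it is not part of River or the JDK).  It caches toString() once, so it assumes the string form of the referent is stable and that toString() itself is cheap, which holds for SocketPermission and URL:

```java
/**
 * Hypothetical wrapper: equality is the referent's class plus its toString()
 * value, so the referent's own equals() and hashCode() are never invoked.
 */
final class StringEquality<T> {
    private final T referent;
    private final String text; // computed once; assumes toString() is stable

    StringEquality(T referent) {
        if (referent == null) throw new NullPointerException("referent");
        this.referent = referent;
        this.text = referent.toString();
    }

    T get() {
        return referent;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StringEquality)) return false;
        StringEquality<?> that = (StringEquality<?>) o;
        // Same class first, then the string forms; no DNS is involved.
        return referent.getClass() == that.referent.getClass()
                && text.equals(that.text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }
}
```

Wrapping keys before they go into a HashMap or HashSet (for example new StringEquality<>(permission)) keeps the collection away from the referent's equals() and hashCode() altogether.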

Cheers,

Peter.


Peter wrote:
> SocketPermissionCollection adds SocketPermission at the head of its internal list.  This change was made in jdk 1.2.2_005  bug 4301064 related to reverse dns lookup delays for applets.
>
> This indicates that the tail of the last policy file parsed, is added last to the policy and hence at the head of that List.
>
> It's  also worth noting that the standard policy provider included with the jvm is in force until the preferred policy provider is completely initiated, after reading all policy files.  So it's likely that the standard java policy is read last by our policy provider implementation.
>
> In summary a list of SocketPermissions need to be sorted beginning from those that cause long dns delays, to wildcard based permissions, so the wildcard perms are added last and hence checked first by any implies calls.
>
> I've got two options on how to solve this:
>
> 1.  Get rid of PermissionCollection based caches altogether and generate PermissionCollection's on demand.
>
> 2.  Replace the PermissionCollection cache with a List<Permission> based cache, generate PermissionCollection's on demand.  Sort the List after creation, before publishing, replace the list on write.
>
> Option 2 could be implemented in ConcurrentPermissions, a replacement for java.security.Permissions.
>
> Option 1 would be implemented by the policy.
>
> In addition, to allow the security manager to cache the results of permission checks for SocketPermission, I can create a wrapper class, where equals and hashcode are based purely on the string representation.  This allows very rapid repeated permission checks.
>
> Looks like I can get around the SocketPermission, CodeSource and URL headaches, relatively unscathed.  
>
> N.B. Anyone care to try out, or seriously performance test the new PreferredClassProvider?
>
> Cheers,
>
> Peter.
>
> ----- Original message -----
>   
>> Actually, more significantly for me is that the default localhost
>> SocketPermission is checked before a more lenient SocketPermission. In theory,
>> one should be able to introspect SocketPermission instances and determine that
>> one may be automatically implied by the other so can be skipped, possibly saving
>> a lookup. Chris
>>
>> Peter Firmstone wrote:
>>     
>>> A big problem with the current implementation is SocketPermission blocks
>>> other permission checks from proceeding.
>>>       
>
>
>   


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter <ji...@zeus.net.au>.
SocketPermissionCollection adds each SocketPermission at the head of its internal list.  This change was made in JDK 1.2.2_005 (bug 4301064) in response to reverse DNS lookup delays for applets.

This indicates that the tail of the last policy file parsed is added last to the policy and hence sits at the head of that list.

It's also worth noting that the standard policy provider included with the JVM is in force until the preferred policy provider is completely initialised, after reading all policy files.  So it's likely that the standard Java policy is read last by our policy provider implementation.

In summary, a list of SocketPermissions needs to be sorted beginning with those that cause long DNS delays and ending with wildcard based permissions, so the wildcard perms are added last and hence checked first by any implies calls.

I've got two options on how to solve this:

1.  Get rid of PermissionCollection based caches altogether and generate PermissionCollection's on demand.

2.  Replace the PermissionCollection cache with a List<Permission> based cache, generate PermissionCollection's on demand.  Sort the List after creation, before publishing, replace the list on write.

Option 2 could be implemented in ConcurrentPermissions, a replacement for java.security.Permissions.

Option 1 would be implemented by the policy.
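
A rough sketch of what option 2 could look like, assuming a hypothetical class named SortedPermissionList and a caller-supplied Comparator (neither is taken from the actual ConcurrentPermissions code).  Readers only ever see an immutable, already sorted snapshot; writers build a new list and swap the reference:

```java
import java.security.Permission;
import java.security.PermissionCollection;
import java.security.Permissions;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical list-based cache: sorted before publishing, replaced on write. */
final class SortedPermissionList {
    private final Comparator<Permission> order;
    // Readers copy only this reference; a published list is never mutated.
    private volatile List<Permission> snapshot = Collections.emptyList();

    SortedPermissionList(Comparator<Permission> order) {
        this.order = order;
    }

    List<Permission> read() {
        return snapshot;
    }

    /** Writers are serialised; readers never block. */
    synchronized void add(Permission p) {
        List<Permission> next = new ArrayList<>(snapshot);
        next.add(p);
        next.sort(order);                        // sort before publishing
        snapshot = Collections.unmodifiableList(next);
    }

    /** Builds a throwaway PermissionCollection on demand from the snapshot. */
    PermissionCollection toCollection() {
        Permissions pc = new Permissions();
        for (Permission p : snapshot) {
            pc.add(p);
        }
        return pc;
    }
}
```

Each caller gets its own PermissionCollection from toCollection(), so a slow implies() in one thread holds no lock that another thread needs.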

In addition, to allow the security manager to cache the results of permission checks for SocketPermission, I can create a wrapper class, where equals() and hashCode() are based purely on the string representation.  This allows very rapid repeated permission checks.
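
As an illustration only, a result cache along these lines might be keyed on a string built from the permission's class, name and actions.  The class CheckResultCache and its methods are hypothetical; a real cache would presumably be held per AccessControlContext and cleared whenever the policy is refreshed:

```java
import java.security.Permission;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;

/** Hypothetical cache of permission check results, keyed purely on strings. */
final class CheckResultCache {
    private final ConcurrentMap<String, Boolean> results = new ConcurrentHashMap<>();

    /** String key; never calls the permission's equals() or hashCode(). */
    static String key(Permission p) {
        return p.getClass().getName() + '|' + p.getName() + '|' + p.getActions();
    }

    /** Returns the cached result, running the full check only on a miss. */
    boolean check(Permission p, Predicate<Permission> fullCheck) {
        String k = key(p);
        Boolean cached = results.get(k);
        if (cached != null) {
            return cached;
        }
        // The (possibly slow) check runs outside any map lock, so other
        // threads are not held up while it completes.
        boolean result = fullCheck.test(p);
        results.putIfAbsent(k, result);
        return result;
    }

    /** Call when the policy changes. */
    void clear() {
        results.clear();
    }
}
```

Two threads that miss at the same time may both run the full check; that trades a little duplicate work for never blocking on the cache.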

Looks like I can get around the SocketPermission, CodeSource and URL headaches, relatively unscathed.  

N.B. Anyone care to try out, or seriously performance test the new PreferredClassProvider?

Cheers,

Peter.

----- Original message -----
> Actually, more significantly for me is that the default localhost
> SocketPermission is checked before a more lenient SocketPermission. In theory,
> one should be able to introspect SocketPermission instances and determine that
> one may be automatically implied by the other so can be skipped, possibly saving
> a lookup. Chris
>
> Peter Firmstone wrote:
> > A big problem with the current implementation is SocketPermission blocks
> > other permission checks from proceeding.


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter <ji...@zeus.net.au>.
That's exactly what I'm thinking: order the SocketPermissions first using a comparator, add them to a new SocketPermissionCollection in that order, then perform the security check.

The comparator can perform the introspection to customise the order for every security check, e.g. so that wild cards are checked first, avoiding the DNS lookup in most cases.

That way comparators encapsulate the introspection and we can keep the policy implementation simpler.
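
A sketch of such a comparator, under some assumptions: the class name WildcardFirst and the ranking (bare "*" first, then "*.domain", then IP literals, then host names) are illustrative, and the order here is fixed rather than customised per check.  It also iterates the sorted list directly instead of adding to a SocketPermissionCollection, because the collection's internal iteration order differs between JDK versions:

```java
import java.net.SocketPermission;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper; names and ranking are illustrative only. */
final class WildcardFirst {
    private static final Pattern IPV4 = Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3}");

    /** Orders purely on the name string; never touches equals() or DNS. */
    static final Comparator<SocketPermission> ORDER =
            Comparator.comparingInt(WildcardFirst::rank);

    /** Host part of the permission name, with any port range removed. */
    static String host(SocketPermission p) {
        String name = p.getName();
        if (name.startsWith("[")) {              // bracketed IPv6 literal
            int end = name.indexOf(']');
            return end > 0 ? name.substring(1, end) : name;
        }
        int colon = name.lastIndexOf(':');
        return colon >= 0 ? name.substring(0, colon) : name;
    }

    static int rank(SocketPermission p) {
        String h = host(p);
        if (h.equals("*")) return 0;             // matches anything
        if (h.startsWith("*.")) return 1;        // domain wildcard
        if (p.getName().startsWith("[") || IPV4.matcher(h).matches()) return 2;
        return 3;                                // host names, checked last
    }

    static List<SocketPermission> sorted(List<SocketPermission> granted) {
        List<SocketPermission> copy = new ArrayList<>(granted);
        copy.sort(ORDER);                        // stable sort
        return copy;
    }

    /** Checks in sorted order and stops at the first match. */
    static boolean implies(List<SocketPermission> granted, SocketPermission requested) {
        for (SocketPermission p : sorted(granted)) {
            if (p.implies(requested)) return true;
        }
        return false;
    }
}
```

With a "*" grant present, implies() returns on the first element and the localhost entry is never consulted.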

In my concurrent policy, while localhost is being resolved for a ProtectionDomain, other threads are blocked from performing any SocketPermission checks on that ProtectionDomain.  If that PD represents library code shared throughout your app, that too can bring it to a standstill.

Cheers,

Peter.

----- Original message -----
> Actually, more significantly for me is that the default localhost
> SocketPermission is checked before a more lenient SocketPermission. In theory,
> one should be able to introspect SocketPermission instances and determine that
> one may be automatically implied by the other so can be skipped, possibly saving
> a lookup. Chris
>
> Peter Firmstone wrote:
> > A big problem with the current implementation is SocketPermission blocks
> > other permission checks from proceeding.


RE: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Christopher Dolan <ch...@avid.com>.
Actually, more significantly for me is that the default localhost SocketPermission is checked before a more lenient SocketPermission. In theory, one should be able to introspect SocketPermission instances and determine that one may be automatically implied by the other so can be skipped, possibly saving a lookup.
Chris

Peter Firmstone wrote:
> A big problem with the current implementation is SocketPermission blocks 
> other permission checks from proceeding.

Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter Firmstone <ji...@zeus.net.au>.
Thinking aloud for a moment:

Chris uses a policy to avoid the localhost lookup.

I think if I build the Permissions collection on demand the 
SocketPermission's can be ordered by sorting them prior to being added 
to SocketPermissionCollection using a Comparator<SocketPermission>, 
based on the SocketPermission being checked.

The comparator can behave differently from equals, using the string 
representations of the host and actions to order them such that wild cards 
are added first and the remaining SocketPermissions follow in their most 
likely order of matching.

This could be a standard feature of the policy, allowing developers to 
provide a custom comparator to order Permission's.

The trick is to avoid the unnecessary DNS lookups, since many are 
performed simply because of the order in which each SocketPermission is 
checked, e.g. localhost being checked first!

If we can reduce the DNS Lookups, to only those SocketPermission checks 
that would likely fail if reverse DNS is unavailable, without causing 
blocking that delays other permission checks, then we should be able to 
make Reggie much more scalable under these conditions.

All permission checks that can succeed, will, even if only partially for 
an AccessControlContext.   The SocketPermission checks that rely on DNS 
will be the last to complete, but since all other permissions can 
complete (even if belonging to the same thread context), the backlog 
will be much smaller.

A big problem with the current implementation is SocketPermission blocks 
other permission checks from proceeding.

Cheers,

Peter.


Christopher Dolan wrote:
> Quite true Gregg, but that doesn't help when Reggie boots and hundreds of hosts contact it in a short time span against a cold DNS cache. Prior to resolution of RIVER-396 ("PreferredClassProvider classloader cache concurrency improvement") these timeout failures were effectively serial and caused long stalls. The resulting OOMEs and failed thread creation events in some isolated scenarios were unrecoverable. For me, this was mitigated by the triple solution of 1) turning off the SocketPermission check, 2) the RIVER-396 patch and 3) switching JERI to NIO to save some threads.
>
> Chris
>
> -----Original Message-----
> From: Gregg Wonderly [mailto:gregg@wonderly.org] 
> Sent: Tuesday, December 13, 2011 8:19 AM
> To: dev@river.apache.org
> Cc: Peter Firmstone
> Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups
>
> Remember too, from a general "workaround" perspective, that you can use command 
> line options to "lengthen" the time that DNS failure information is retained, to 
> keep things moving when no reverse DNS information is available.  The default, 
> is like 10 seconds, and that is considerably shorter than what you will 
> generally experience in a failed lookup.  The end result, is that the failure 
> cache doesn't serve much purpose without it having a very extended time, as a 
> workaround.   In some cases, I've set it to an hour or more, and some initial 
> startup is then "slow", and initial client "connection" can be a little slow, 
> but then things move along quite well.
>
> Gregg Wonderly
>
> On 12/13/2011 2:56 AM, Peter Firmstone wrote:
>   
>> In addition CodeSource.implies() also causes DNS checks, I'm not 100% sure 
>> about the jvm code, but Harmony code uses SocketPermission.implies() to check 
>> if one CodeSource implies another, I believe the jvm policy implementation 
>> also utilises it, because harmony's implementation is built from Sun's java spec.
>>
>> So in the existing policy implementations, when parsing the policy files, 
>> additional start up delays may be caused by the CodeSource.implies() method 
>> making network DNS calls.
>>
>> In my ConcurrentPolicyFile implementation (to replace the standard java 
>> PolicyFile implementation), I've created a URIGrant, I've taken code from 
>> Harmony to implement implies(ProtectionDomain pd), that performs wildcard 
>> matching compliant with CodeSource.implies, the only difference being, that no 
>> attempt to resolve URI's is made.
>>
>> Typically most policy files specify file based URL's for CodeSource, however 
>> in a network application where many CodeSources may be network URL's, DNS 
>> lookup causes added delays.
>>
>> I've also created a CodeSourceGrant which uses CodeSource.implies() for 
>> backward compatibility with existing java policy files, however I'm sure that 
>> most will simply want to revise their policy files.
>>
>> The standard interface PermissionGrant, is implemented by the following 
>> inheritance hierarchy of immutable classes:
>>
>>                   PrincipalGrant
>>           _______________|_______________
>>           |                             |
>> ProtectionDomainGrant           CertificateGrant
>>           |                   __________|__________
>>   ClassLoaderGrant            |                   |
>>                           URIGrant         CodeSourceGrant
>>
>>
>> Only PrincipalGrant is publicly visible, a builder returns the correct 
>> implementation.
>>
>> ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the 
>> completely new DynamicPolicyProvider (which has long since passed all tests).
>>
>> CertificateGrant, URIGrant and CodeSourceGrant are used by the File based 
>> policy's and RemotePolicy, which is intended to be a service that nodes in a 
>> djinn can use to allow an administrator to update the policy (eg to include 
>> new certificates or principals), with all the protection of subject 
>> authentication and secure connections.  RemotePolicy is idempotent, the policy 
>> is updated in one operation, so the current policy state is always known to 
>> the administrator (who is a client).
>>
>> Since a File based policy is mostly read and only written when refreshed, 
>> PermissionGrant's are held in a volatile array reference, copied (only the 
>> reference) by any code that reads the array.  The array reference is updated 
>> when the policy is updated, the array is never mutated after publishing.
>>
>> A ConcurrentMap<ProtectionDomain, PermissionCollection> (with weak keys) acts 
>> as a cache.  I've got ConcurrentPermissions, an implementation that replaces 
>> the heterogeneous java.security.Permissions class; this also resolves any 
>> unresolved permissions.
>>
>> However I'm starting to wonder if it's wiser to throw away the cache 
>> altogether and simply build java.security.Permissions on demand, then throw 
>> Permissions away immediately after use for collection in the young generation 
>> heap (it's likely to fit in level 2 cache and never even be copied to Ram).  
>> This would eliminate contention between existing PermissionCollection's that 
>> block, like SocketPermissionCollection.
>>
>> So if you have for instance 100 different AccessControlContext's being checked 
>> by different threads, that all contain the same ProtectionDomain's for a 
>> SocketPermission, then all will be executed in parallel.  Currently due to 
>> blocking, each SocketPermission that performs a DNS check must either resolve 
>> or time out before its SocketPermissionCollection can release its 
>> synchronization lock (and there may be multiple SocketPermission's in a 
>> SocketPermissionCollection), before another thread can check its context and 
>> so on, which explains everything coming to a standstill.
>>
>> If all permission checks execute in parallel independently, without blocking, 
>> then the timeout won't be magnified.
>>
>> I am considering going one step further and replacing SocketPermission and 
>> SocketPermissionCollection, and implementing DNS checks in the 
>> SocketPermissionCollection rather than SocketPermission.  By doing this a 
>> matching record will be found in most cases without requiring DNS reverse 
>> lookup.  If I keep this as an internal policy implementation detail, then if 
>> Oracle fixes SocketPermission, we can return to using the standard java 
>> implementation, in fact I could make it a configuration property.
>>
>> It's an unfortunate fact that not all permission checks are performed in the 
>> policy, replacing SocketPermission also requires the cooperation of the 
>> SecurityManager.  To make matters worse, static ProtectionDomains created 
>> prior to my policy implementation being constructed will never consult my 
>> policy implementation as such they will still contain SocketPermission.   So 
>> the SecurityManager would need to check each ProtectionDomain for both 
>> implementations, so reimplementing SocketPermission doesn't eliminate its use 
>> entirely.
>>
>> It's worth noting that SocketPermission is implemented rather poorly and the 
>> same functionality can be provided with far fewer DNS lookups being performed, 
>> since the majority are performed completely unnecessarily.  Perhaps it's worth 
>> me donating some time to OpenJDK to fix it, I'd have to check with Apache 
>> legal first I suppose.
>>
>> The problems with DNS lookup also affects CodeSource and URL equals and 
>> hashcode methods, so these classes shouldn't be used in collections.
>>
>> Cheers,
>>
>> Peter.
>>
>> Christopher Dolan wrote:
>>     
>>> To simulate the problem, go to InetAddress.getHostFromNameService() in your 
>>> IDE, set a breakpoint on the "nameService.getHostByAddr" line with a 
>>> condition of something like this:
>>>
>>>      new java.util.concurrent.CountDownLatch(1).await(15, java.util.concurrent.TimeUnit.SECONDS)
>>>
>>> then launch your River application from within the IDE. This will cause all 
>>> reverse DNS lookups to stall for 15 seconds before succeeding. This will 
>>> affect Reggie the worst because it has to verify so many hostnames. In a 
>>> large group (a few thousand services) this will drive Reggie's thread count 
>>> skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>>>
>>> This problem happens in the real world in facilities that allow client 
>>> connections to the production LAN, but do not allow the production LAN to 
>>> resolve hosts in the client LAN. This may occur due to separate IT teams or 
>>> strict security rules or simple configuration errors. Because most 
>>> client-server systems, like web servers, do not require the server to contact 
>>> the client this problem does not become immediately visible to IT. Instead, 
>>> the question is inevitably "Why is Jini/River so sensitive to reverse DNS? 
>>> All of my other services work fine."
>>>
>>> Chris
>>>
>>> -----Original Message-----
>>> From: Tom Hobbs [mailto:tvhobbs@googlemail.com]
>>> Sent: Monday, December 12, 2011 1:43 PM
>>> To: dev@river.apache.org
>>> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and DNS lookups
>>>
>>> My biggest concern with such fundamental changes is controlling the impact
>>> it will have.  I'm a pretty good example of this, I haven't experienced the
>>> troubles these changes are intended to overcome.  I also haven't made
>>> any attempt to dive into these areas of the code, for any reason.
>>>
>>> Is it possible to put together a test case which exposes these problems and
>>> also proves the solution?
>>>
>>> Obviously, a test case involving misconfigured networks is daft, in that
>>> instance a handy "if your network misconfigured" diagnostic tool or
>>> documentation would be a good idea.
>>>
>>> Please don't interpret this concern as a criticism of your work, Peter.
>>> Far from it.  It's just a comment born out of not really having any contact
>>> with the area you're working in!
>>>
>>>
>>> Grammar and spelling have been sacrificed on the altar of messaging via
>>> mobile device.
>>>
>>> On 12 Dec 2011 18:01, "Christopher Dolan" <ch...@avid.com>
>>> wrote:
>>>
>>>       
>>>> Specifically for SocketPermission, I experienced severe timeout problems
>>>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>>>> relaxed this criterion via 'new SocketPermission("*",
>>>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>>>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>>>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>>>> lookup. To avoid this inconvenient setting, I install a new
>>>> java.security.Policy subclass that delegates to the default Policy except
>>>> when the incoming permission is a SocketPermission. That way I don't need
>>>> to modify the policy file in the JVM. The Policy.implies() override method
>>>> is trivial because it just needs to do " if (permission instanceof
>>>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>>>> to override (skip over any SocketPermission elements in the default
>>>> Policy's PermissionCollection), but still only about 50 LOC.
>>>>
>>>> Chris
>>>>
>>>> -----Original Message-----
>>>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>>>> Sent: Friday, December 09, 2011 9:28 PM
>>>> To: dev@river.apache.org
>>>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>>>> lookups
>>>>
>>>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>>>> equals, hashCode and implies methods create some serious performance
>>>> problems for distributed programs.
>>>>
>>>> The concurrent policy implementation I've been working on reduces lock
>>>> contention between threads performing security checks.
>>>>
>>>> When the SecurityManager is used to check a guard, it calls the
>>>> AccessController, which retrieves the AccessControlContext from the call
>>>> stack, this contains all the ProtectionDomain's on the call stack (I
>>>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>>>> it will consult the Policy, prior to checking the static permissions it
>>>> contains.
>>>>
>>>> The problem with the old policy implementation is lock contention caused
>>>> by multiple threads all using multiple ProtectionDomains, when the time
>>>> taken to perform a check is considerable, especially where identical
>>>> security checks might be performed by multiple threads executing the
>>>> same code.
>>>>
>>>> Although concurrent policy reduces contention between ProtectionDomain's
>>>> calls to Policy.implies, there remain some fundamental problems with the
>>>> implementations of SocketPermission and URL, that cause unnecessary DNS
>>>> lookups during equals(), hashCode() and implies() methods.
>>>>
>>>> The following bugs concern SocketPermission (please read before
>>>> continuing) :
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>>>> lot of valuable comments.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>>>> perhaps incorrectly.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>>>
>>>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>>>> are performed for the equals and hashCode implementations in
>>>> SocketPermission and URL, with disastrous performance implications for
>>>> policy implementations using collections and caching security permission
>>>> check results.
>>>>
>>>> For example, once a SocketPermission guard has been checked for a
>>>> specific AccessControlContext the result is cached by my SecurityManager,
>>>> avoiding repeat security checks, however if that cache contains
>>>> SocketPermission, DNS lookups will be required, the cache will perform
>>>> slower than some other directly performed security checks!  The cache is
>>>> intended to return quickly to avoid reconsulting every ProtectionDomain
>>>> on the stack.
>>>>
>>>> To make matters worse, when checking a SocketPermission guard, the DNS
>>>> may be consulted for every non wild card SocketPermission contained
>>>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>>>> are being made unnecessarily, since the wild card that matches may not
>>>> require a DNS lookup at all, but because the non matching
>>>> SocketPermission's are being checked first, the DNS lookups and reverse
>>>> lookups are still performed.  This could be fixed completely, by moving
>>>> the responsibility of DNS lookups from SocketPermission to
>>>> SocketPermissionCollection.
>>>>
>>>> The identity of two SocketPermission's are equal if they resolve to the
>>>> same IP address, but their hashCode's are different! See bug 6592623.
>>>>
>>>> The identity of a SocketPermission with an IP address and a DNS name,
>>>> resolving to identical IP address should not (in my opinion) be equal,
>>>> but is!  One SocketPermission should only imply the other while DNS
>>>> resolves to the same IP address, otherwise the equality of the two
>>>> SocketPermission's will change if the IP address is assigned to a
>>>> different domain!  Object equality / identity shouldn't depend on the
>>>> result of a possibly unreliable network source.
>>>>
>>>> SocketPermission and SocketPermissionCollection are broken, the only
>>>> solution I can think of is to re-implement these classes (from Harmony)
>>>> in the policy and SecurityManager, substituting the existing jvm
>>>> classes.  This would not be visible to client developers.
>>>>
>>>> SocketPermission's may also exist in a ProtectionDomain's static
>>>> Permissions, these would have to be converted by the policy when merging
>>>> the permissions from the ProtectionDomain with those from the policy.
>>>> Since ProtectionDomain, attempts to check it's own internal permissions,
>>>> after the policy permission check fails, DNS checks are currently
>>>> performed by duplicate SocketPermission's residing in the
>>>> ProtectionDomain, this will no longer occur, since the permission being
>>>> checked will be converted to say for argument sake
>>>> org.apache.river.security.SocketPermission.  However because some
>>>> ProtectionDomains are static, they never consult the policy, so the
>>>> Permission's contained in each ProtectionDomain will require conversion
>>>> also, to do so will require extending and implementing a
>>>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>>>> AccessControlContext, by utilising a DomainCombiner.
>>>>
>>>> For CodeSource grant's, the policy file based grant's are defined by
>>>> URL's, however URL's identity depend upon DNS record results, similar to
>>>> SocketPermission equals and hashCode implementations which we have no
>>>> control over.
>>>>
>>>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>>>> lookups, then allowing a policy compatibility mode to be enabled (with
>>>> logging) for falling back to CodeSource grant's when a URL cannot be
>>>> converted to a URI, this is a much simpler fix than the SocketPermission
>>>> problem.
>>>>
>>>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>>>> equals (that's a good thing), the contained CodeSource must also be
>>>> checked, again potentially slowing down permission checks with DNS
>>>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>>>> Grant's to use URI based comparison would be relatively simple, since
>>>> the URI is obtained dynamically when the dynamic grant is created.
>>>>
>>>> URI based grant's don't use DNS resolution and would have a narrower
>>>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>>>> URL based CodeSource and vice versa.  Rather than rely on DNS
>>>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>>>> names in policy files.  URL.toURI() can be utilised to check if URI
>>>> grant's imply a CodeSource without resorting to DNS.
>>>>
>>>> Any thoughts, comments or ideas?
>>>>
>>>> N.B. It's sad that security is implemented the way it is, it would be
>>>> far better if it was Executor based, since every protection domain could
>>>> be checked in parallel, rather than in sequence.
>>>>
>>>> Regards,
>>>>
>>>> Peter.


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Gregg Wonderly <gr...@wonderly.org>.
Yeah Chris, you are right, there is an upper limit on how much you can gain with 
such a work around.  Scaling up to hundreds if not thousands of services, is 
something that not many people seem to have experience with.   Just my first 
experience with ~10 machines with ~15 services each on them, was quite an eye 
opener on how long resolving codebases and downloading took for generic 
"serviceUI" lookups.

Gregg

On 12/13/2011 8:31 AM, Christopher Dolan wrote:
> Quite true Gregg, but that doesn't help when Reggie boots and hundreds of hosts contact it in a short time span against a cold DNS cache. Prior to resolution of RIVER-396 ("PreferredClassProvider classloader cache concurrency improvement") these timeout failures were effectively serial and caused long stalls. The resulting OOMEs and failed thread creation events in some isolated scenarios were unrecoverable. For me, this was mitigated by the triple solution of 1) turning off the SocketPermission check, 2) the RIVER-396 patch and 3) switching JERI to NIO to save some threads.
>
> Chris
>
> -----Original Message-----
> From: Gregg Wonderly [mailto:gregg@wonderly.org]
> Sent: Tuesday, December 13, 2011 8:19 AM
> To: dev@river.apache.org
> Cc: Peter Firmstone
> Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups
>
> Remember too, from a general "workaround" perspective, that you can use command
> line options to "lengthen" the time that DNS failure information is retained, to
> keep things moving when no reverse DNS information is available.  The default,
> is like 10 seconds, and that is considerably shorter than what you will
> generally experience in a failed lookup.  The end result, is that the failure
> cache doesn't serve much purpose without it having a very extended time, as a
> workaround.   In some cases, I've set it to an hour or more, and some initial
> startup is then "slow", and initial client "connection" can be a little slow,
> but then things move along quite well.
>
> Gregg Wonderly
>
> On 12/13/2011 2:56 AM, Peter Firmstone wrote:
>> In addition CodeSource.implies() also causes DNS checks, I'm not 100% sure
>> about the jvm code, but Harmony code uses SocketPermission.implies() to check
>> if one CodeSource implies another, I believe the jvm policy implementation
>> also utilises it, because harmony's implementation is built from Sun's java spec.
>>
>> So in the existing policy implementations, when parsing the policy files,
>> additional start up delays may be caused by the CodeSource.implies() method
>> making network DNS calls.
>>
>> In my ConcurrentPolicyFile implementation (to replace the standard java
>> PolicyFile implementation), I've created a URIGrant, I've taken code from
>> Harmony to implement implies(ProtectionDomain pd), that performs wildcard
>> matching compliant with CodeSource.implies, the only difference being, that no
>> attempt to resolve URI's is made.
>>
>> Typically most policy files specify file based URL's for CodeSource, however
>> in a network application where many CodeSources may be network URL's, DNS
>> lookup causes added delays.
>>
>> I've also created a CodeSourceGrant which uses CodeSource.implies() for
>> backward compatibility with existing java policy files, however I'm sure that
>> most will simply want to revise their policy files.
>>
>> The standard interface PermissionGrant, is implemented by the following
>> inheritance hierarchy of immutable classes:
>>
>>                    PrincipalGrant
>>           ________________|________________
>>           |                               |
>> ProtectionDomainGrant             CertificateGrant
>>           |                   ____________|____________
>>   ClassLoaderGrant            |                       |
>>                           URIGrant             CodeSourceGrant
>>
>>
>> Only PrincipalGrant is publicly visible, a builder returns the correct
>> implementation.
>>
>> ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the
>> completely new DynamicPolicyProvider (which has long since passed all tests).
>>
>> CertificateGrant, URIGrant and CodeSourceGrant are used by the File based
>> policy's and RemotePolicy, which is intended to be a service that nodes in a
>> djinn can use to allow an administrator to update the policy (eg to include
>> new certificates or principals), with all the protection of subject
>> authentication and secure connections.  RemotePolicy is idempotent, the policy
>> is updated in one operation, so the current policy state is always known to
>> the administrator (who is a client).
>>
>> Since a File based policy is mostly read and only written when refreshed,
>> PermissionGrant's are held in a volatile array reference, copied (only the
>> reference) by any code that reads the array.  The array reference is updated
>> when the policy is updated, the array is never mutated after publishing.
>>
>> A ConcurrentMap<ProtectionDomain, PermissionCollection>  (with weak keys) acts
>> as a cache, I've got ConcurrentPermissions, an implementation that replaces
>> the heterogeneous java.security.Permissions class, this also resolves any
>> unresolved permissions.
>>
>> However I'm starting to wonder if it's wiser to throw away the cache
>> altogether and simply build java.security.Permissions on demand, then throw
>> Permissions away immediately after use for collection in the young generation
>> heap (it's likely to fit in level 2 cache and never even be copied to Ram).
>> This would eliminate contention between existing PermissionCollection's that
>> block, like SocketPermissionCollection.
>>
>> So if you have for instance 100 different AccessControlContext's being checked
>> by different threads, that all contain the same ProtectionDomain's for a
>> SocketPermission, then all will be executed in parallel.  Currently due to
>> blocking, each SocketPermission that performs a DNS check must either resolve
>> or time out, before its SocketPermissionCollection can release its
>> synchronization lock (and there may be multiple SocketPermission's in a
>> SocketPermissionCollection), before another thread can check its context and
>> so on, which explains everything coming to a standstill.
>>
>> If all permission checks execute in parallel independently, without blocking,
>> then the timeout won't be magnified.
>>
>> I am considering going one step further and replacing SocketPermission and
>> SocketPermissionCollection, and implementing DNS checks in the
>> SocketPermissionCollection rather than SocketPermission.  By doing this a
>> matching record will be found in most cases without requiring DNS reverse
>> lookup.  If I keep this as an internal policy implementation detail, then if
>> Oracle fixes SocketPermission, we can return to using the standard java
>> implementation, in fact I could make it a configuration property.
>>
>> It's an unfortunate fact that not all permission checks are performed in the
>> policy, replacing SocketPermission also requires the cooperation of the
>> SecurityManager.  To make matters worse, static ProtectionDomains created
>> prior to my policy implementation being constructed will never consult my
>> policy implementation as such they will still contain SocketPermission.   So
>> the SecurityManager would need to check each ProtectionDomain for both
>> implementations, so reimplementing SocketPermission doesn't eliminate its use
>> entirely.
>>
>> It's worth noting that SocketPermission is implemented rather poorly and the
>> same functionality can be provided with far fewer DNS lookups being performed,
>> since the majority are performed completely unnecessarily.  Perhaps it's worth
>> me donating some time to OpenJDK to fix it, I'd have to check with Apache
>> legal first I suppose.
>>
>> The problems with DNS lookup also affect CodeSource and URL equals and
>> hashCode methods, so these classes shouldn't be used in collections.
>>
>> Cheers,
>>
>> Peter.
>>
>> Christopher Dolan wrote:
>>> To simulate the problem, go to InetAddress.getHostFromNameService() in your
>>> IDE, set a breakpoint on the "nameService.getHostByAddr" line with a
>>> condition of something like this:
>>>
>>>       new java.util.concurrent.CountDownLatch(1).await(15,
>>> java.util.concurrent.TimeUnit.SECONDS)
>>>
>>> then launch your River application from within the IDE. This will cause all
>>> reverse DNS lookups to stall for 15 seconds before succeeding. This will
>>> affect Reggie the worst because it has to verify so many hostnames. In a
>>> large group (a few thousand services) this will drive Reggie's thread count
>>> skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>>>
>>> This problem happens in the real world in facilities that allow client
>>> connections to the production LAN, but do not allow the production LAN to
>>> resolve hosts in the client LAN. This may occur due to separate IT teams or
>>> strict security rules or simple configuration errors. Because most
>>> client-server systems, like web servers, do not require the server to contact
>>> the client this problem does not become immediately visible to IT. Instead,
>>> the question is inevitably "Why is Jini/River so sensitive to reverse DNS?
>>> All of my other services work fine."
>>>
>>> Chris
>>>
>>> -----Original Message-----
>>> From: Tom Hobbs [mailto:tvhobbs@googlemail.com] Sent: Monday, December 12,
>>> 2011 1:43 PM
>>> To: dev@river.apache.org
>>> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and
>>> DNS lookups
>>>
>>> My biggest concern with such fundamental changes is controlling the impact
>>> it will have.  I'm a pretty good example of this, I haven't experienced the
>>> troubles these changes are intended to overcome.  I also haven't made
>>> any attempt to dive into these areas of the code, for any reason.
>>>
>>> Is it possible to put together a test case which exposes these problems and
>>> also proves the solution?
>>>
>>> Obviously, a test case involving misconfigured networks is daft, in that
>>> instance a handy "is your network misconfigured" diagnostic tool or
>>> documentation would be a good idea.
>>>
>>> Please don't interpret this concern as a criticism of your work, Peter.
>>> Far from it.  It's just a comment born out of not really having any contact
>>> with the area you're working in!
>>>
>>>
>>> Grammar and spelling have been sacrificed on the altar of messaging via
>>> mobile device.
>>>
>>> On 12 Dec 2011 18:01, "Christopher Dolan"<ch...@avid.com>
>>> wrote:
>>>
>>>> Specifically for SocketPermission, I experienced severe timeout problems
>>>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>>>> relaxed this criterion via 'new SocketPermission("*",
>>>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>>>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>>>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>>>> lookup. To avoid this inconvenient setting, I install a new
>>>> java.security.Policy subclass that delegates to the default Policy except
>>>> when the incoming permission is a SocketPermission. That way I don't need
>>>> to modify the policy file in the JVM. The Policy.implies() override method
>>>> is trivial because it just needs to do " if (permission instanceof
>>>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>>>> to override (skip over any SocketPermission elements in the default
>>>> Policy's PermissionCollection), but still only about 50 LOC.
>>>>
>>>> Chris
>>>>
>>>> -----Original Message-----
>>>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>>>> Sent: Friday, December 09, 2011 9:28 PM
>>>> To: dev@river.apache.org
>>>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>>>> lookups
>>>>
>>>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>>>> equals, hashCode and implies methods create some serious performance
>>>> problems for distributed programs.
>>>>
>>>> The concurrent policy implementation I've been working on reduces lock
>>>> contention between threads performing security checks.
>>>>
>>>> When the SecurityManager is used to check a guard, it calls the
>>>> AccessController, which retrieves the AccessControlContext from the call
>>>> stack, this contains all the ProtectionDomain's on the call stack (I
>>>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>>>> it will consult the Policy, prior to checking the static permissions it
>>>> contains.
>>>>
>>>> The problem with the old policy implementation is lock contention caused
>>>> by multiple threads all using multiple ProtectionDomains, when the time
>>>> taken to perform a check is considerable, especially where identical
>>>> security checks might be performed by multiple threads executing the
>>>> same code.
>>>>
>>>> Although concurrent policy reduces contention between ProtectionDomain's
>>>> calls to Policy.implies, there remain some fundamental problems with the
>>>> implementations of SocketPermission and URL, that cause unnecessary DNS
>>>> lookups during equals(), hashCode() and implies() methods.
>>>>
>>>> The following bugs concern SocketPermission (please read before
>>>> continuing) :
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>>>> lot of valuable comments.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>>>> perhaps incorrectly.
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>>>
>>>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>>>> are performed for the equals and hashCode implementations in
>>>> SocketPermission and URL, with disastrous performance implications for
>>>> policy implementations using collections and caching security permission
>>>> check results.
>>>>
>>>> For example, once a SocketPermission guard has been checked for a
>>>> specific AccessControlContext the result is cached by my SecurityManager,
>>>> avoiding repeat security checks, however if that cache contains
>>>> SocketPermission, DNS lookups will be required, the cache will perform
>>>> slower than some other directly performed security checks!  The cache is
>>>> intended to return quickly to avoid reconsulting every ProtectionDomain
>>>> on the stack.
>>>>
>>>> To make matters worse, when checking a SocketPermission guard, the DNS
>>>> may be consulted for every non wild card SocketPermission contained
>>>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>>>> are being made unnecessarily, since the wild card that matches may not
>>>> require a DNS lookup at all, but because the non matching
>>>> SocketPermission's are being checked first, the DNS lookups and reverse
>>>> lookups are still performed.  This could be fixed completely, by moving
>>>> the responsibility of DNS lookups from SocketPermission to
>>>> SocketPermissionCollection.
>>>>
>>>> The identity of two SocketPermission's are equal if they resolve to the
>>>> same IP address, but their hashCode's are different! See bug 6592623.
>>>>
>>>> The identity of a SocketPermission with an IP address and a DNS name,
>>>> resolving to identical IP address should not (in my opinion) be equal,
>>>> but is!  One SocketPermission should only imply the other while DNS
>>>> resolves to the same IP address, otherwise the equality of the two
>>>> SocketPermission's will change if the IP address is assigned to a
>>>> different domain!  Object equality / identity shouldn't depend on the
>>>> result of a possibly unreliable network source.
>>>>
>>>> SocketPermission and SocketPermissionCollection are broken, the only
>>>> solution I can think of is to re-implement these classes (from Harmony)
>>>> in the policy and SecurityManager, substituting the existing jvm
>>>> classes.  This would not be visible to client developers.
>>>>
>>>> SocketPermission's may also exist in a ProtectionDomain's static
>>>> Permissions, these would have to be converted by the policy when merging
>>>> the permissions from the ProtectionDomain with those from the policy.
>>>> Since ProtectionDomain, attempts to check its own internal permissions,
>>>> after the policy permission check fails, DNS checks are currently
>>>> performed by duplicate SocketPermission's residing in the
>>>> ProtectionDomain, this will no longer occur, since the permission being
>>>> checked will be converted to say for argument sake
>>>> org.apache.river.security.SocketPermission.  However because some
>>>> ProtectionDomains are static, they never consult the policy, so the
>>>> Permission's contained in each ProtectionDomain will require conversion
>>>> also, to do so will require extending and implementing a
>>>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>>>> AccessControlContext, by utilising a DomainCombiner.
>>>>
>>>> For CodeSource grant's, the policy file based grant's are defined by
>>>> URL's, however URL's identity depend upon DNS record results, similar to
>>>> SocketPermission equals and hashCode implementations which we have no
>>>> control over.
>>>>
>>>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>>>> lookups, then allowing a policy compatibility mode to be enabled (with
>>>> logging) for falling back to CodeSource grant's when a URL cannot be
>>>> converted to a URI, this is a much simpler fix than the SocketPermission
>>>> problem.
>>>>
>>>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>>>> equals (that's a good thing), the contained CodeSource must also be
>>>> checked, again potentially slowing down permission checks with DNS
>>>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>>>> Grant's to use URI based comparison would be relatively simple, since
>>>> the URI is obtained dynamically when the dynamic grant is created.
>>>>
>>>> URI based grant's don't use DNS resolution and would have a narrower
>>>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>>>> URL based CodeSource and vice versa.  Rather than rely on DNS
>>>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>>>> names in policy files.  URL.toURI() can be utilised to check if URI
>>>> grant's imply a CodeSource without resorting to DNS.
>>>>
>>>> Any thoughts, comments or ideas?
>>>>
>>>> N.B. It's sad that security is implemented the way it is, it would be
>>>> far better if it was Executor based, since every protection domain could
>>>> be checked in parallel, rather than in sequence.
>>>>
>>>> Regards,
>>>>
>>>> Peter.
>>>>
>>>>
>>>>
>>
>


RE: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Christopher Dolan <ch...@avid.com>.
Quite true Gregg, but that doesn't help when Reggie boots and hundreds of hosts contact it in a short time span against a cold DNS cache. Prior to resolution of RIVER-396 ("PreferredClassProvider classloader cache concurrency improvement") these timeout failures were effectively serial and caused long stalls. The resulting OOMEs and failed thread creation events in some isolated scenarios were unrecoverable. For me, this was mitigated by the triple solution of 1) turning off the SocketPermission check, 2) the RIVER-396 patch and 3) switching JERI to NIO to save some threads.
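
For readers who want to try the first mitigation, here is a minimal sketch of a delegating Policy along the lines described further down this thread. The class name SocketPermissivePolicy is hypothetical, and this is not the code actually deployed. It assumes that granting every SocketPermission is acceptable on the LAN in question, and it only covers Policy.implies(); the PermissionCollection filtering mentioned in the quoted message is left out.

```java
import java.net.SocketPermission;
import java.security.Permission;
import java.security.Policy;
import java.security.ProtectionDomain;

/**
 * Hypothetical sketch: wraps an existing Policy and answers every
 * SocketPermission check itself, so the wrapped policy's SocketPermission
 * entries (and their DNS lookups) are never consulted.
 * Assumption: unrestricted socket access is acceptable in this deployment.
 */
@SuppressWarnings({"removal", "deprecation"})
final class SocketPermissivePolicy extends Policy {
    private final Policy delegate;

    SocketPermissivePolicy(Policy delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean implies(ProtectionDomain domain, Permission permission) {
        if (permission instanceof SocketPermission) {
            // Short-circuit: no SocketPermission.implies(), hence no DNS.
            return true;
        }
        // Everything else is decided by the wrapped policy.
        return delegate.implies(domain, permission);
    }
}
```

On JVMs that still support a SecurityManager, something like Policy.setPolicy(new SocketPermissivePolicy(Policy.getPolicy())) early in startup would install it; newer JDKs deprecate or disable that mechanism.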

Chris

-----Original Message-----
From: Gregg Wonderly [mailto:gregg@wonderly.org] 
Sent: Tuesday, December 13, 2011 8:19 AM
To: dev@river.apache.org
Cc: Peter Firmstone
Subject: Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Remember too, from a general "workaround" perspective, that you can use command 
line options to "lengthen" the time that DNS failure information is retained, to 
keep things moving when no reverse DNS information is available.  The default, 
is like 10 seconds, and that is considerably shorter than what you will 
generally experience in a failed lookup.  The end result, is that the failure 
cache doesn't serve much purpose without it having a very extended time, as a 
workaround.   In some cases, I've set it to an hour or more, and some initial 
startup is then "slow", and initial client "connection" can be a little slow, 
but then things move along quite well.
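
A minimal sketch of that workaround, for reference. It assumes the standard security property networkaddress.cache.negative.ttl (value in seconds, default 10) is the setting meant here; on Sun/Oracle JVMs the command line form is believed to be -Dsun.net.inetaddr.negative.ttl=3600. The class and method names are hypothetical, and the property has to be set before the JVM performs its first lookup.

```java
import java.security.Security;

/** Hypothetical helper: lengthens how long failed DNS lookups are cached. */
final class NegativeDnsCacheConfig {
    static final String NEGATIVE_TTL = "networkaddress.cache.negative.ttl";

    /** Must run before the first InetAddress lookup to take effect. */
    static void cacheFailuresFor(int seconds) {
        Security.setProperty(NEGATIVE_TTL, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        cacheFailuresFor(3600); // one hour, as suggested above
        System.out.println(NEGATIVE_TTL + "=" + Security.getProperty(NEGATIVE_TTL));
    }
}
```

The trade-off is that a host whose reverse DNS record is later fixed stays "failed" until the cached entry expires.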

Gregg Wonderly

On 12/13/2011 2:56 AM, Peter Firmstone wrote:
> In addition CodeSource.implies() also causes DNS checks, I'm not 100% sure 
> about the jvm code, but Harmony code uses SocketPermission.implies() to check 
> if one CodeSource implies another, I believe the jvm policy implementation 
> also utilises it, because harmony's implementation is built from Sun's java spec.
>
> So in the existing policy implementations, when parsing the policy files, 
> additional start up delays may be caused by the CodeSource.implies() method 
> making network DNS calls.
>
> In my ConcurrentPolicyFile implementation (to replace the standard java 
> PolicyFile implementation), I've created a URIGrant, I've taken code from 
> Harmony to implement implies(ProtectionDomain pd), that performs wildcard 
> matching compliant with CodeSource.implies, the only difference being, that no 
> attempt to resolve URI's is made.
>
> Typically most policy files specify file based URL's for CodeSource, however 
> in a network application where many CodeSources may be network URL's, DNS 
> lookup causes added delays.
>
> I've also created a CodeSourceGrant which uses CodeSource.implies() for 
> backward compatibility with existing java policy files, however I'm sure that 
> most will simply want to revise their policy files.
>
> The standard interface PermissionGrant, is implemented by the following 
> inheritance hierarchy of immutable classes:
>
>                    PrincipalGrant
>           ________________|________________
>           |                               |
> ProtectionDomainGrant             CertificateGrant
>           |                   ____________|____________
>   ClassLoaderGrant            |                       |
>                           URIGrant             CodeSourceGrant
>
>
> Only PrincipalGrant is publicly visible, a builder returns the correct 
> implementation.
>
> ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the 
> completely new DynamicPolicyProvider (which has long since passed all tests).
>
> CertificateGrant, URIGrant and CodeSourceGrant are used by the File based 
> policy's and RemotePolicy, which is intended to be a service that nodes in a 
> djinn can use to allow an administrator to update the policy (eg to include 
> new certificates or principals), with all the protection of subject 
> authentication and secure connections.  RemotePolicy is idempotent, the policy 
> is updated in one operation, so the current policy state is always known to 
> the administrator (who is a client).
>
> Since a File based policy is mostly read and only written when refreshed, 
> PermissionGrant's are held in a volatile array reference, copied (only the 
> reference) by any code that reads the array.  The array reference is updated 
> when the policy is updated, the array is never mutated after publishing.
>
> A ConcurrentMap<ProtectionDomain, PermissionCollection> (with weak keys) acts 
> as a cache, I've got ConcurrentPermissions, an implementation that replaces 
> the heterogeneous java.security.Permissions class, this also resolves any 
> unresolved permissions.
>
> However I'm starting to wonder if it's wiser to throw away the cache 
> altogether and simply build java.security.Permissions on demand, then throw 
> Permissions away immediately after use for collection in the young generation 
> heap (it's likely to fit in level 2 cache and never even be copied to Ram).  
> This would eliminate contention between existing PermissionCollection's that 
> block, like SocketPermissionCollection.
>
> So if you have for instance 100 different AccessControlContext's being checked 
> by different threads, that all contain the same ProtectionDomain's for a 
> SocketPermission, then all will be executed in parallel.  Currently due to 
> blocking, each SocketPermission that performs a DNS check must either resolve 
> or time out, before its SocketPermissionCollection can release its 
> synchronization lock (and there may be multiple SocketPermission's in a 
> SocketPermissionCollection), before another thread can check its context and 
> so on, which explains everything coming to a standstill.
>
> If all permission checks execute in parallel independently, without blocking, 
> then the timeout won't be magnified.
>
> I am considering going one step further and replacing SocketPermission and 
> SocketPermissionCollection, and implementing DNS checks in the 
> SocketPermissionCollection rather than SocketPermission.  By doing this a 
> matching record will be found in most cases without requiring DNS reverse 
> lookup.  If I keep this as an internal policy implementation detail, then if 
> Oracle fixes SocketPermission, we can return to using the standard java 
> implementation, in fact I could make it a configuration property.
>
> It's an unfortunate fact that not all permission checks are performed in the 
> policy, replacing SocketPermission also requires the cooperation of the 
> SecurityManager.  To make matters worse, static ProtectionDomains created 
> prior to my policy implementation being constructed will never consult my 
> policy implementation as such they will still contain SocketPermission.   So 
> the SecurityManager would need to check each ProtectionDomain for both 
> implementations, so reimplementing SocketPermission doesn't eliminate its use 
> entirely.
>
> It's worth noting that SocketPermission is implemented rather poorly and the 
> same functionality can be provided with far fewer DNS lookups being performed, 
> since the majority are performed completely unnecessarily.  Perhaps it's worth 
> me donating some time to OpenJDK to fix it, I'd have to check with Apache 
> legal first I suppose.
>
> The problems with DNS lookup also affect CodeSource and URL equals and 
> hashCode methods, so these classes shouldn't be used in collections.
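
As an illustration of the point about collections, a small sketch that keys a set on java.net.URI instead of URL. URI.equals() and URI.hashCode() compare the parsed components (host names case-insensitively) and never touch the network. The class name, method name and example hosts are hypothetical.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical demo: codebase locations stored as URI, never as URL. */
final class UriKeyDemo {
    /** Builds a set of normalized URIs; no DNS lookup happens here. */
    static Set<URI> codebaseSet(String... locations) {
        Set<URI> set = new HashSet<>();
        for (String location : locations) {
            set.add(URI.create(location).normalize());
        }
        return set;
    }

    public static void main(String[] args) {
        Set<URI> set = codebaseSet(
                "http://example.org/a.jar",
                "http://EXAMPLE.org/a.jar"); // same host, different case
        System.out.println(set.size());
        // An IP literal is a different key, even if DNS would map it to the host.
        System.out.println(set.contains(URI.create("http://192.0.2.1/a.jar")));
    }
}
```

This also shows the narrower scope discussed in the original post: an IP-based entry and a DNS-name entry are simply different keys.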
>
> Cheers,
>
> Peter.
>
> Christopher Dolan wrote:
>> To simulate the problem, go to InetAddress.getHostFromNameService() in your 
>> IDE, set a breakpoint on the "nameService.getHostByAddr" line with a 
>> condition of something like this:
>>
>>      new java.util.concurrent.CountDownLatch(1).await(15, 
>> java.util.concurrent.TimeUnit.SECONDS)
>>
>> then launch your River application from within the IDE. This will cause all 
>> reverse DNS lookups to stall for 15 seconds before succeeding. This will 
>> affect Reggie the worst because it has to verify so many hostnames. In a 
>> large group (a few thousand services) this will drive Reggie's thread count 
>> skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>>
>> This problem happens in the real world in facilities that allow client 
>> connections to the production LAN, but do not allow the production LAN to 
>> resolve hosts in the client LAN. This may occur due to separate IT teams or 
>> strict security rules or simple configuration errors. Because most 
>> client-server systems, like web servers, do not require the server to contact 
>> the client this problem does not become immediately visible to IT. Instead, 
>> the question is inevitably "Why is Jini/River so sensitive to reverse DNS? 
>> All of my other services work fine."
>>
>> Chris
>>
>> -----Original Message-----
>> From: Tom Hobbs [mailto:tvhobbs@googlemail.com] Sent: Monday, December 12, 
>> 2011 1:43 PM
>> To: dev@river.apache.org
>> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and 
>> DNS lookups
>>
>> My biggest concern with such fundamental changes is controlling the impact
>> it will have.  I'm a pretty good example of this, I haven't experienced the
>> troubles these changes are intended to overcome.  I also haven't made
>> any attempt to dive into these areas of the code, for any reason.
>>
>> Is it possible to put together a test case which exposes these problems and
>> also proves the solution?
>>
>> Obviously, a test case involving misconfigured networks is daft, in that
>> instance a handy "is your network misconfigured" diagnostic tool or
>> documentation would be a good idea.
>>
>> Please don't interpret this concern as a criticism of your work, Peter.
>> Far from it.  It's just a comment born out of not really having any contact
>> with the area you're working in!
>>
>>
>> Grammar and spelling have been sacrificed on the altar of messaging via
>> mobile device.
>>
>> On 12 Dec 2011 18:01, "Christopher Dolan" <ch...@avid.com>
>> wrote:
>>
>>> Specifically for SocketPermission, I experienced severe timeout problems
>>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>>> relaxed this criterion via 'new SocketPermission("*",
>>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>>> lookup. To avoid this inconvenient setting, I install a new
>>> java.security.Policy subclass that delegates to the default Policy except
>>> when the incoming permission is a SocketPermission. That way I don't need
>>> to modify the policy file in the JVM. The Policy.implies() override method
>>> is trivial because it just needs to do " if (permission instanceof
>>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>>> to override (skip over any SocketPermission elements in the default
>>> Policy's PermissionCollection), but still only about 50 LOC.
>>>
>>> Chris
>>>
>>> -----Original Message-----
>>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>>> Sent: Friday, December 09, 2011 9:28 PM
>>> To: dev@river.apache.org
>>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>>> lookups
>>>
>>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>>> equals, hashCode and implies methods create some serious performance
>>> problems for distributed programs.
>>>
>>> The concurrent policy implementation I've been working on reduces lock
>>> contention between threads performing security checks.
>>>
>>> When the SecurityManager is used to check a guard, it calls the
>>> AccessController, which retrieves the AccessControlContext from the call
>>> stack, this contains all the ProtectionDomain's on the call stack (I
>>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>>> it will consult the Policy, prior to checking the static permissions it
>>> contains.
>>>
>>> The problem with the old policy implementation is lock contention caused
>>> by multiple threads all using multiple ProtectionDomains, when the time
>>> taken to perform a check is considerable, especially where identical
>>> security checks might be performed by multiple threads executing the
>>> same code.
>>>
>>> Although concurrent policy reduces contention between ProtectionDomain's
>>> calls to Policy.implies, there remain some fundamental problems with the
>>> implementations of SocketPermission and URL, that cause unnecessary DNS
>>> lookups during equals(), hashCode() and implies() methods.
>>>
>>> The following bugs concern SocketPermission (please read before
>>> continuing) :
>>>
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>>> lot of valuable comments.
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>>> perhaps incorrectly.
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>>
>>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>>> are performed for the equals and hashCode implementations in
>>> SocketPermission and URL, with disastrous performance implications for
>>> policy implementations using collections and caching security permission
>>> check results.
>>>
>>> For example, once a SocketPermission guard has been checked for a
>>> specific AccessControlContext the result is cached by my SecurityManager,
>>> avoiding repeat security checks, however if that cache contains
>>> SocketPermission, DNS lookups will be required, the cache will perform
>>> slower than some other directly performed security checks!  The cache is
>>> intended to return quickly to avoid reconsulting every ProtectionDomain
>>> on the stack.
>>>
>>> To make matters worse, when checking a SocketPermission guard, the DNS
>>> may be consulted for every non wild card SocketPermission contained
>>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>>> are being made unnecessarily, since the wild card that matches may not
>>> require a DNS lookup at all, but because the non matching
>>> SocketPermission's are being checked first, the DNS lookups and reverse
>>> lookups are still performed.  This could be fixed completely, by moving
>>> the responsibility of DNS lookups from SocketPermission to
>>> SocketPermissionCollection.
>>>
>>> The identity of two SocketPermission's are equal if they resolve to the
>>> same IP address, but their hashCode's are different! See bug 6592623.
>>>
>>> The identity of a SocketPermission with an IP address and a DNS name,
>>> resolving to identical IP address should not (in my opinion) be equal,
>>> but is!  One SocketPermission should only imply the other while DNS
>>> resolves to the same IP address, otherwise the equality of the two
>>> SocketPermission's will change if the IP address is assigned to a
>>> different domain!  Object equality / identity shouldn't depend on the
>>> result of a possibly unreliable network source.
>>>
>>> SocketPermission and SocketPermissionCollection are broken, the only
>>> solution I can think of is to re-implement these classes (from Harmony)
>>> in the policy and SecurityManager, substituting the existing jvm
>>> classes.  This would not be visible to client developers.
>>>
>>> SocketPermission's may also exist in a ProtectionDomain's static
>>> Permissions, these would have to be converted by the policy when merging
>>> the permissions from the ProtectionDomain with those from the policy.
>>> Since ProtectionDomain, attempts to check its own internal permissions,
>>> after the policy permission check fails, DNS checks are currently
>>> performed by duplicate SocketPermission's residing in the
>>> ProtectionDomain, this will no longer occur, since the permission being
>>> checked will be converted to say for argument sake
>>> org.apache.river.security.SocketPermission.  However because some
>>> ProtectionDomains are static, they never consult the policy, so the
>>> Permission's contained in each ProtectionDomain will require conversion
>>> also, to do so will require extending and implementing a
>>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>>> AccessControlContext, by utilising a DomainCombiner.
>>>
>>> For CodeSource grant's, the policy file based grant's are defined by
>>> URL's, however URL's identity depend upon DNS record results, similar to
>>> SocketPermission equals and hashCode implementations which we have no
>>> control over.
>>>
>>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>>> lookups, then allowing a policy compatibility mode to be enabled (with
>>> logging) for falling back to CodeSource grant's when a URL cannot be
>>> converted to a URI, this is a much simpler fix than the SocketPermission
>>> problem.
>>>
>>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>>> equals (that's a good thing), the contained CodeSource must also be
>>> checked, again potentially slowing down permission checks with DNS
>>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>>> Grant's to use URI based comparison would be relatively simple, since
>>> the URI is obtained dynamically when the dynamic grant is created.
>>>
>>> URI based grant's don't use DNS resolution and would have a narrower
>>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>>> URL based CodeSource and vice versa.  Rather than rely on DNS
>>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>>> names in policy files.  URL.toURI() can be utilised to check if URI
>>> grant's imply a CodeSource without resorting to DNS.
>>>
>>> Any thoughts, comments or ideas?
>>>
>>> N.B. It's sad that security is implemented the way it is, it would be
>>> far better if it was Executor based, since every protection domain could
>>> be checked in parallel, rather than in sequence.
>>>
>>> Regards,
>>>
>>> Peter.
>>>
>>>
>>>
>>
>
>


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Gregg Wonderly <gr...@wonderly.org>.
Remember too, from a general "workaround" perspective, that you can use command 
line options to "lengthen" the time that DNS failure information is retained, to 
keep things moving when no reverse DNS information is available.  The default 
is something like 10 seconds, which is considerably shorter than the time a 
failed lookup generally takes.  The end result is that the failure cache 
doesn't serve much purpose unless it is given a much longer lifetime.  In 
some cases, I've set it to an hour or more; some initial startup is then 
"slow", and the initial client "connection" can be a little slow, but then 
things move along quite well.
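
As a rough sketch of that workaround (not taken from any River code): the 
negative DNS cache lifetime is controlled by the security property 
networkaddress.cache.negative.ttl, and on Sun/Oracle JVMs it can also be 
given on the command line as -Dsun.net.inetaddr.negative.ttl=3600.  The 
class name and the one hour value below are only illustrative, and the 
property should be set before the first name lookup for it to take effect.

```java
import java.security.Security;

// Hypothetical helper, for illustration only.
class NegativeDnsCache {
    // Standard security property controlling how long failed lookups are cached.
    static final String NEGATIVE_TTL = "networkaddress.cache.negative.ttl";

    // Sets the failure cache lifetime in seconds and returns the value now in effect.
    static String extendNegativeTtl(int seconds) {
        Security.setProperty(NEGATIVE_TTL, Integer.toString(seconds));
        return Security.getProperty(NEGATIVE_TTL);
    }

    public static void main(String[] args) {
        // Assumed value: one hour, as mentioned above.
        System.out.println(extendNegativeTtl(3600));
    }
}
```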

Gregg Wonderly

On 12/13/2011 2:56 AM, Peter Firmstone wrote:
> In addition CodeSource.implies() also causes DNS checks, I'm not 100% sure 
> about the jvm code, but Harmony code uses SocketPermission.implies() to check 
> if one CodeSource implies another, I believe the jvm policy implementation 
> also utilises it, because harmony's implementation is built from Sun's java spec.
>
> So in the existing policy implementations, when parsing the policy files, 
> additional start up delays may be caused by the CodeSource.implies() method 
> making network DNS calls.
>
> In my ConcurrentPolicyFile implementation (to replace the standard java 
> PolicyFile implementation), I've created a URIGrant, I've taken code from 
> Harmony to implement implies(ProtectionDomain pd), that performs wildcard 
> matching compliant with CodeSource.implies, the only difference being, that no 
> attempt to resolve URI's is made.
>
> Typically most policy files specify file based URL's for CodeSource, however 
> in a network application where many CodeSources may be network URL's, DNS 
> lookup causes added delays.
>
> I've also created a CodeSourceGrant which uses CodeSource.implies() for 
> backward compatibility with existing java policy files, however I'm sure that 
> most will simply want to revise their policy files.
>
> The standard interface PermissionGrant, is implemented by the following 
> inheritance hierarchy of immutable classes:
>
>                     PrincipalGrant
>           ________________|________________
>          |                                 |
> ProtectionDomainGrant               CertificateGrant
>          |                   ______________|______________
>   ClassLoaderGrant          |                             |
>                          URIGrant                  CodeSourceGrant
>
>
> Only PrincipalGrant is publicly visible, a builder returns the correct 
> implementation.
>
> ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the 
> completely new DynamicPolicyProvider (which has long since passed all tests).
>
> CertificateGrant, URIGrant and CodeSourceGrant are used by the File based 
> policy's and RemotePolicy, which is intended to be a service that nodes in a 
> djinn can use to allow an administrator to update the policy (eg to include 
> new certificates or principals), with all the protection of subject 
> authentication and secure connections.  RemotePolicy is idempotent, the policy 
> is updated in one operation, so the current policy state is always known to 
> the administrator (who is a client).
>
> Since a File based policy is mostly read and only written when refreshed, 
> PermissionGrant's are held in a volatile array reference, copied (only the 
> reference) by any code that reads the array.  The array reference is updated 
> when the policy is updated, the array is never mutated after publishing.
>
> A ConcurrentMap<ProtectionDomain, PermissionCollection> (with weak keys) acts 
> as a cache, I've got ConcurrentPermissions, an implementation that replaces 
> the heterogeneous java.security.Permissions class, this also resolves any 
> unresolved permissions.
>
> However I'm starting to wonder if it's wiser to throw away the cache 
> altogether and simply build java.security.Permissions on demand, then throw 
> Permissions away immediately after use for collection in the young generation 
> heap (it's likely to fit in level 2 cache and never even be copied to Ram).  
> This would eliminate contention between existing PermissionCollection's that 
> block, like SocketPermissionCollection.
>
> So if you have for instance 100 different AccessControlContext's being checked 
> by different threads, that all contain the same ProtectionDomain's for a 
> SocketPermission, then all will be executed in parallel.  Currently due to 
> blocking, each SocketPermission that performs a DNS check must either resolve 
> or timeout, before it's SocketPermissionCollection can release it's 
> synchronization lock (and there may be multiple SocketPermission's in a 
> SocketPermissionCollection), before another thread can check it's context and 
> so on, which explains everything coming to a standstill.
>
> If all permission checks execute in parallel independently, without blocking, 
> then the timeout won't be magnified.
>
> I am considering going one step further and replacing SocketPermission and 
> SocketPermissionCollection, and implementing DNS checks in the 
> SocketPermissionCollection rather than SocketPermission.  By doing this a 
> matching record will be found in most cases without requiring DNS reverse 
> lookup.  If I keep this as an internal policy implementation detail, then if 
> Oracle fixes SocketPermission, we can return to using the standard java 
> implementation, in fact I could make it a configuration property.
>
> It's an unfortunate fact that not all permission checks are performed in the 
> policy, replacing SocketPermission also requires the cooperation of the 
> SecurityManager.  To make matters worse, static ProtectionDomains created 
> prior to my policy implementation being constructed will never consult my 
> policy implementation as such they will still contain SocketPermission.   So 
> the SecurityManager would need to check each ProtectionDomain for both 
> implementations, so reimplementing SocketPermission doesn't eliminate its use 
> entirely.
>
> It's worth noting that SocketPermission is implemented rather poorly and the 
> same functionality can be provided with far fewer DNS lookups being performed, 
> since the majority are performed completely unnecessarily.  Perhaps it's worth 
> me donating some time to OpenJDK to fix it, I'd have to check with Apache 
> legal first I suppose.
>
> The problems with DNS lookup also affects CodeSource and URL equals and 
> hashcode methods, so these classes shouldn't be used in collections.
>
> Cheers,
>
> Peter.
>
> Christopher Dolan wrote:
>> To simulate the problem, go to InetAddress.getHostFromNameService() in your 
>> IDE, set a breakpoint on the "nameService.getHostByAddr" line with a 
>> condition of something like this:
>>
>>      new java.util.concurrent.CountDownLatch(1).await(15, 
>> java.util.concurrent.TimeUnit.SECONDS)
>>
>> then launch your River application from within the IDE. This will cause all 
>> reverse DNS lookups to stall for 15 seconds before succeeding. This will 
>> affect Reggie the worst because it has to verify so many hostnames. In a 
>> large group (a few thousand services) this will drive Reggie's thread count 
>> skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>>
>> This problem happens in the real world in facilities that allow client 
>> connections to the production LAN, but do not allow the production LAN to 
>> resolve hosts in the client LAN. This may occur due to separate IT teams or 
>> strict security rules or simple configuration errors. Because most 
>> client-server systems, like web servers, do not require the server to contact 
>> the client this problem does not become immediately visible to IT. Instead, 
>> the question is inevitably "Why is Jini/River so sensitive to reverse DNS? 
>> All of my other services work fine."
>>
>> Chris
>>
>> -----Original Message-----
>> From: Tom Hobbs [mailto:tvhobbs@googlemail.com] Sent: Monday, December 12, 
>> 2011 1:43 PM
>> To: dev@river.apache.org
>> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and 
>> DNS lookups
>>
>> My biggest concern with such fundamental changes is controlling the impact
>> it will have.  I'm a pretty good example of this, I haven't experienced the
>> troubles these changes are intended to overcome.  I also haven't made
>> any attempt to dive into these areas of the code, for any reason.
>>
>> Is it possible to put together a test case which exposes these problems and
>> also proves the solution?
>>
>> Obviously, a test case involving misconfigured networks is daft, in that
>> instance a handy "if your network misconfigured" diagnostic tool or
>> documentation would be a good idea.
>>
>> Please don't interpret this concern as a criticism of your work, Peter.
>> Far from it.  It's just a comment born out of not really having any contact
>> with the area you're working in!
>>
>>
>> Grammar and spelling have been sacrificed on the altar of messaging via
>> mobile device.
>>
>> On 12 Dec 2011 18:01, "Christopher Dolan" <ch...@avid.com>
>> wrote:
>>
>>> Specifically for SocketPermission, I experienced severe timeout problems
>>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>>> relaxed this criterion via 'new SocketPermission("*",
>>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>>> lookup. To avoid this inconvenient setting, I install a new
>>> java.security.Policy subclass that delegates to the default Policy except
>>> when the incoming permission is a SocketPermission. That way I don't need
>>> to modify the policy file in the JVM. The Policy.implies() override method
>>> is trivial because it just needs to do " if (permission instanceof
>>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>>> to override (skip over any SocketPermission elements in the default
>>> Policy's PermissionCollection), but still only about 50 LOC.
>>>
>>> Chris
>>>
>>> -----Original Message-----
>>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>>> Sent: Friday, December 09, 2011 9:28 PM
>>> To: dev@river.apache.org
>>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>>> lookups
>>>
>>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>>> equals, hashCode and implies methods create some serious performance
>>> problems for distributed programs.
>>>
>>> The concurrent policy implementation I've been working on reduces lock
>>> contention between threads performing security checks.
>>>
>>> When the SecurityManager is used to check a guard, it calls the
>>> AccessController, which retrieves the AccessControlContext from the call
>>> stack, this contains all the ProtectionDomain's on the call stack (I
>>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>>> it will consult the Policy, prior to checking the static permissions it
>>> contains.
>>>
>>> The problem with the old policy implementation is lock contention caused
>>> by multiple threads all using multiple ProtectionDomains, when the time
>>> taken to perform a check is considerable, especially where identical
>>> security checks might be performed by multiple threads executing the
>>> same code.
>>>
>>> Although concurrent policy reduces contention between ProtectionDomain's
>>> calls to Policy.implies, there remain some fundamental problems with the
>>> implementations of SocketPermission and URL, that cause unnecessary DNS
>>> lookups during equals(), hashCode() and implies() methods.
>>>
>>> The following bugs concern SocketPermission (please read before
>>> continuing) :
>>>
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>>> lot of valuable comments.
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>>> perhaps incorrectly.
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>>
>>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>>> are performed for the equals and hashCode implementations in
>>> SocketPermission and URL, with disastrous performance implications for
>>> policy implementations using collections and caching security permission
>>> check results.
>>>
>>> For example, once a SocketPermission guard has been checked for a
>>> specific AccessControlContext the result is cached by my SecurityManager,
>>> avoiding repeat security checks, however if that cache contains
>>> SocketPermission, DNS lookups will be required, the cache will perform
>>> slower than some other directly performed security checks!  The cache is
>>> intended to return quickly to avoid reconsulting every ProtectionDomain
>>> on the stack.
>>>
>>> To make matters worse, when checking a SocketPermission guard, the DNS
>>> may be consulted for every non wild card SocketPermission contained
>>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>>> are being made unnecessarily, since the wild card that matches may not
>>> require a DNS lookup at all, but because the non matching
>>> SocketPermission's are being checked first, the DNS lookups and reverse
>>> lookups are still performed.  This could be fixed completely, by moving
>>> the responsibility of DNS lookups from SocketPermission to
>>> SocketPermissionCollection.
>>>
>>> The identity of two SocketPermission's are equal if they resolve to the
>>> same IP address, but their hashCode's are different! See bug 6592623.
>>>
>>> The identity of a SocketPermission with an IP address and a DNS name,
>>> resolving to identical IP address should not (in my opinion) be equal,
>>> but is!  One SocketPermission should only imply the other while DNS
>>> resolves to the same IP address, otherwise the equality of the two
>>> SocketPermission's will change if the IP address is assigned to a
>>> different domain!  Object equality / identity shouldn't depend on the
>>> result of a possibly unreliable network source.
>>>
>>> SocketPermission and SocketPermissionCollection are broken, the only
>>> solution I can think of is to re-implement these classes (from Harmony)
>>> in the policy and SecurityManager, substituting the existing jvm
>>> classes.  This would not be visible to client developers.
>>>
>>> SocketPermission's may also exist in a ProtectionDomain's static
>>> Permissions, these would have to be converted by the policy when merging
>>> the permissions from the ProtectionDomain with those from the policy.
>>> Since ProtectionDomain attempts to check its own internal permissions
>>> after the policy permission check fails, DNS checks are currently
>>> performed by duplicate SocketPermission's residing in the
>>> ProtectionDomain, this will no longer occur, since the permission being
>>> checked will be converted to say for argument sake
>>> org.apache.river.security.SocketPermission.  However because some
>>> ProtectionDomains are static, they never consult the policy, so the
>>> Permission's contained in each ProtectionDomain will require conversion
>>> also, to do so will require extending and implementing a
>>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>>> AccessControlContext, by utilising a DomainCombiner.
>>>
>>> For CodeSource grant's, the policy file based grant's are defined by
>>> URL's, however URL's identity depend upon DNS record results, similar to
>>> SocketPermission equals and hashCode implementations which we have no
>>> control over.
>>>
>>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>>> lookups, then allowing a policy compatibility mode to be enabled (with
>>> logging) for falling back to CodeSource grant's when a URL cannot be
>>> converted to a URI, this is a much simpler fix than the SocketPermission
>>> problem.
>>>
>>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>>> equals (that's a good thing), the contained CodeSource must also be
>>> checked, again potentially slowing down permission checks with DNS
>>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>>> Grant's to use URI based comparison would be relatively simple, since
>>> the URI is obtained dynamically when the dynamic grant is created.
>>>
>>> URI based grant's don't use DNS resolution and would have a narrower
>>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>>> URL based CodeSource and vice versa.  Rather than rely on DNS
>>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>>> names in policy files.  URL.toURI() can be utilised to check if URI
>>> grant's imply a CodeSource without resorting to DNS.
>>>
>>> Any thoughts, comments or ideas?
>>>
>>> N.B. It's sad that security is implemented the way it is, it would be
>>> far better if it was Executor based, since every protection domain could
>>> be checked in parallel, rather than in sequence.
>>>
>>> Regards,
>>>
>>> Peter.
>>>
>>>
>>>
>>
>
>


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter Firmstone <ji...@zeus.net.au>.
In addition, CodeSource.implies() also causes DNS checks.  I'm not 100% 
sure about the jvm code, but the Harmony code uses 
SocketPermission.implies() to check if one CodeSource implies another.  I 
believe the jvm policy implementation also utilises it, because 
Harmony's implementation is built from Sun's java spec.

So in the existing policy implementations, when parsing the policy 
files, additional start up delays may be caused by the 
CodeSource.implies() method making network DNS calls.

In my ConcurrentPolicyFile implementation (to replace the standard java 
PolicyFile implementation), I've created a URIGrant.  I've taken code 
from Harmony to implement implies(ProtectionDomain pd), which performs 
wildcard matching compliant with CodeSource.implies; the only difference 
is that no attempt is made to resolve URI's.
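
To illustrate the idea only (this is not the URIGrant code, and the class 
and method names are made up), a string based match in the spirit of the 
CodeSource.implies wildcard rules might look like the sketch below.  It 
assumes a trailing "/-" means everything below a directory, a trailing 
"/*" means entries directly in it, and anything else must match exactly.  
No host name is ever resolved, so an IP based grant does not imply a DNS 
name based location.

```java
import java.net.URI;

// Hypothetical sketch of DNS-free wildcard matching; not River's URIGrant.
class UriGrantSketch {
    // Returns true if the grant location implies the code source location,
    // comparing normalized URI strings only (host comparison is case-sensitive here).
    static boolean implies(URI grant, URI codeSource) {
        String g = grant.normalize().toString();
        String c = codeSource.normalize().toString();
        if (g.endsWith("/-")) {
            // "dir/-" : anything at or below dir/
            return c.startsWith(g.substring(0, g.length() - 1));
        }
        if (g.endsWith("/*")) {
            // "dir/*" : entries directly inside dir/, no deeper
            String dir = g.substring(0, g.length() - 1);
            return c.startsWith(dir) && c.indexOf('/', dir.length()) < 0;
        }
        return g.equals(c);
    }

    public static void main(String[] args) {
        URI grant = URI.create("http://example.org/lib/-");
        System.out.println(implies(grant, URI.create("http://example.org/lib/a/b.jar")));
    }
}
```

A grant written against 127.0.0.1 and a code source written against 
localhost are simply different strings here, which is the narrower scope 
described earlier in the thread.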

Typically most policy files specify file based URL's for CodeSource, 
however in a network application where many CodeSources may be network 
URL's, DNS lookup causes added delays.

I've also created a CodeSourceGrant which uses CodeSource.implies() for 
backward compatibility with existing java policy files, however I'm sure 
that most will simply want to revise their policy files.

The standard interface PermissionGrant, is implemented by the following 
inheritance hierarchy of immutable classes:

                    PrincipalGrant
          ________________|________________
         |                                 |
ProtectionDomainGrant               CertificateGrant
         |                   ______________|______________
  ClassLoaderGrant          |                             |
                         URIGrant                  CodeSourceGrant


Only PrincipalGrant is publicly visible, a builder returns the correct 
implementation.

ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by 
the completely new DynamicPolicyProvider (which has long since passed 
all tests).

CertificateGrant, URIGrant and CodeSourceGrant are used by the File 
based policy's and RemotePolicy, which is intended to be a service that 
nodes in a djinn can use to allow an administrator to update the policy 
(eg to include new certificates or principals), with all the protection 
of subject authentication and secure connections.  RemotePolicy is 
idempotent, the policy is updated in one operation, so the current 
policy state is always known to the administrator (who is a client).

Since a File based policy is mostly read and only written when 
refreshed, PermissionGrant's are held in a volatile array reference, 
copied (only the reference) by any code that reads the array.  The array 
reference is updated when the policy is updated, the array is never 
mutated after publishing.
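
A minimal sketch of that publication pattern, with String standing in for 
PermissionGrant and a made-up class name: readers copy only the volatile 
reference, and a refresh builds a new array and swaps the reference in one 
write, so a reader never sees a partly updated policy.

```java
import java.util.Arrays;

// Hypothetical holder; String stands in for PermissionGrant.
class GrantHolder {
    // The array behind this reference is never mutated after it is published.
    private volatile String[] grants = new String[0];

    // Readers take a snapshot by copying the reference only.
    String[] snapshot() {
        return grants;
    }

    // A refresh copies the new grants privately, then publishes with one volatile write.
    void refresh(String[] updated) {
        grants = Arrays.copyOf(updated, updated.length);
    }

    public static void main(String[] args) {
        GrantHolder holder = new GrantHolder();
        String[] before = holder.snapshot();
        holder.refresh(new String[] {"grant-a", "grant-b"});
        // The earlier snapshot is unaffected by the refresh.
        System.out.println(before.length + " -> " + holder.snapshot().length);
    }
}
```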

A ConcurrentMap<ProtectionDomain, PermissionCollection> (with weak keys) 
acts as a cache.  I've got ConcurrentPermissions, an implementation that 
replaces the heterogeneous java.security.Permissions class; this also 
resolves any unresolved permissions.

However I'm starting to wonder if it's wiser to throw away the cache 
altogether and simply build java.security.Permissions on demand, then 
throw Permissions away immediately after use for collection in the young 
generation heap (it's likely to fit in level 2 cache and never even be 
copied to RAM).  This would eliminate contention between existing 
PermissionCollection's that block, like SocketPermissionCollection.

So if you have for instance 100 different AccessControlContext's being 
checked by different threads, that all contain the same 
ProtectionDomain's for a SocketPermission, then all will be executed in 
parallel.  Currently, due to blocking, each SocketPermission that 
performs a DNS check must either resolve or time out before its 
SocketPermissionCollection can release its synchronization lock (and 
there may be multiple SocketPermission's in a 
SocketPermissionCollection), and only then can another thread check its 
context, and so on, which explains everything coming to a standstill.

If all permission checks execute in parallel independently, without 
blocking, then the timeout won't be magnified.

I am considering going one step further and replacing SocketPermission 
and SocketPermissionCollection, and implementing DNS checks in the 
SocketPermissionCollection rather than SocketPermission.  By doing this 
a matching record will be found in most cases without requiring DNS 
reverse lookup.  If I keep this as an internal policy implementation 
detail, then if Oracle fixes SocketPermission, we can return to using 
the standard java implementation, in fact I could make it a 
configuration property.

It's an unfortunate fact that not all permission checks are performed in 
the policy; replacing SocketPermission also requires the cooperation of 
the SecurityManager.  To make matters worse, static ProtectionDomains 
created prior to my policy implementation being constructed will never 
consult my policy implementation, so they will still contain 
SocketPermission.  The SecurityManager would therefore need to check each 
ProtectionDomain for both implementations, so reimplementing 
SocketPermission doesn't eliminate its use entirely.

It's worth noting that SocketPermission is implemented rather poorly and 
the same functionality can be provided with far fewer DNS lookups being 
performed, since the majority are performed completely unnecessarily.  
Perhaps it's worth me donating some time to OpenJDK to fix it, I'd have 
to check with Apache legal first I suppose.

The problems with DNS lookup also affect the CodeSource and URL equals and 
hashCode methods, so these classes shouldn't be used in collections.
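
As a small illustration (the class and method names are made up), 
java.net.URI can stand in as the collection key, because its equals and 
hashCode compare the URI components as strings and never consult DNS.  
The example deliberately avoids calling URL.equals so that it runs without 
touching the network.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example: counting distinct code locations with URI keys.
class UriKeys {
    // Adds each location to a HashSet<URI>; equality is purely textual.
    static int distinctLocations(String... locations) {
        Set<URI> seen = new HashSet<>();
        for (String location : locations) {
            seen.add(URI.create(location));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // localhost and 127.0.0.1 stay distinct because nothing is resolved.
        System.out.println(distinctLocations(
                "http://localhost/a.jar",
                "http://127.0.0.1/a.jar",
                "http://localhost/a.jar"));
    }
}
```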

Cheers,

Peter.

Christopher Dolan wrote:
> To simulate the problem, go to InetAddress.getHostFromNameService() in your IDE, set a breakpoint on the "nameService.getHostByAddr" line with a condition of something like this:
>
>      new java.util.concurrent.CountDownLatch(1).await(15, java.util.concurrent.TimeUnit.SECONDS)
>
> then launch your River application from within the IDE. This will cause all reverse DNS lookups to stall for 15 seconds before succeeding. This will affect Reggie the worst because it has to verify so many hostnames. In a large group (a few thousand services) this will drive Reggie's thread count skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.
>
> This problem happens in the real world in facilities that allow client connections to the production LAN, but do not allow the production LAN to resolve hosts in the client LAN. This may occur due to separate IT teams or strict security rules or simple configuration errors. Because most client-server systems, like web servers, do not require the server to contact the client this problem does not become immediately visible to IT. Instead, the question is inevitably "Why is Jini/River so sensitive to reverse DNS? All of my other services work fine."
>
> Chris
>
> -----Original Message-----
> From: Tom Hobbs [mailto:tvhobbs@googlemail.com] 
> Sent: Monday, December 12, 2011 1:43 PM
> To: dev@river.apache.org
> Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and DNS lookups
>
> My biggest concern with such fundamental changes is controlling the impact
> it will have.  I'm a pretty good example of this, I haven't experienced the
> troubles these changes are intended to overcome.  I also haven't made
> any attempt to dive into these areas of the code, for any reason.
>
> Is it possible to put together a test case which exposes these problems and
> also proves the solution?
>
> Obviously, a test case involving misconfigured networks is daft, in that
> instance a handy "if your network misconfigured" diagnostic tool or
> documentation would be a good idea.
>
> Please don't interpret this concern as a criticism of your work, Peter.
> Far from it.  It's just a comment born out of not really having any contact
> with the area you're working in!
>
>
> Grammar and spelling have been sacrificed on the altar of messaging via
> mobile device.
>
> On 12 Dec 2011 18:01, "Christopher Dolan" <ch...@avid.com>
> wrote:
>
>   
>> Specifically for SocketPermission, I experienced severe timeout problems
>> with reverse DNS misconfigurations. For some LAN-based deployments, I
>> relaxed this criterion via 'new SocketPermission("*",
>> "accept,listen,connect,resolve")'. This was difficult to apply to a general
>> Sun/Oracle JVM, however, because the default security policy *prepends* a
>> ("localhost:1024-","listen") permission that triggers the reverse DNS
>> lookup. To avoid this inconvenient setting, I install a new
>> java.security.Policy subclass that delegates to the default Policy except
>> when the incoming permission is a SocketPermission. That way I don't need
>> to modify the policy file in the JVM. The Policy.implies() override method
>> is trivial because it just needs to do " if (permission instanceof
>> SocketPermission) { ... }". The PermissionCollection methods were trickier
>> to override (skip over any SocketPermission elements in the default
>> Policy's PermissionCollection), but still only about 50 LOC.
>>
>> Chris
>>
>> -----Original Message-----
>> From: Peter Firmstone [mailto:jini@zeus.net.au]
>> Sent: Friday, December 09, 2011 9:28 PM
>> To: dev@river.apache.org
>> Subject: Implications for Security Checks - SocketPermission, URL and DNS
>> lookups
>>
>> DNS lookups and reverse lookups caused by URL and SocketPermission,
>> equals, hashCode and implies methods create some serious performance
>> problems for distributed programs.
>>
>> The concurrent policy implementation I've been working on reduces lock
>> contention between threads performing security checks.
>>
>> When the SecurityManager is used to check a guard, it calls the
>> AccessController, which retrieves the AccessControlContext from the call
>> stack, this contains all the ProtectionDomain's on the call stack (I
>> won't go into privileged calls here), if a ProtectionDomain is dynamic
>> it will consult the Policy, prior to checking the static permissions it
>> contains.
>>
>> The problem with the old policy implementation is lock contention caused
>> by multiple threads all using multiple ProtectionDomains, when the time
>> taken to perform a check is considerable, especially where identical
>> security checks might be performed by multiple threads executing the
>> same code.
>>
>> Although concurrent policy reduces contention between ProtectionDomain's
>> calls to Policy.implies, there remain some fundamental problems with the
>> implementations of SocketPermission and URL, that cause unnecessary DNS
>> lookups during equals(), hashCode() and implies() methods.
>>
>> The following bugs concern SocketPermission (please read before
>> continuing) :
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
>> lot of valuable comments.
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
>> perhaps incorrectly.
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746
>>
>> Anyway to cut a long story short, DNS lookups and DNS reverse lookups
>> are performed for the equals and hashCode implementations in
>> SocketPermission and URL, with disastrous performance implications for
>> policy implementations using collections and caching security permission
>> check results.
>>
>> For example, once a SocketPermission guard has been checked for a
>> specific AccessControlContext the result is cached by my SecurityManager,
>> avoiding repeat security checks, however if that cache contains
>> SocketPermission, DNS lookups will be required, the cache will perform
>> slower than some other directly performed security checks!  The cache is
>> intended to return quickly to avoid reconsulting every ProtectionDomain
>> on the stack.
>>
>> To make matters worse, when checking a SocketPermission guard, the DNS
>> may be consulted for every non wild card SocketPermission contained
>> within a SocketPermissionCollection, up until it is implied.  DNS checks
>> are being made unnecessarily, since the wild card that matches may not
>> require a DNS lookup at all, but because the non matching
>> SocketPermission's are being checked first, the DNS lookups and reverse
>> lookups are still performed.  This could be fixed completely, by moving
>> the responsibility of DNS lookups from SocketPermission to
>> SocketPermissionCollection.
>>
>> The identity of two SocketPermission's are equal if they resolve to the
>> same IP address, but their hashCode's are different! See bug 6592623.
>>
>> The identity of a SocketPermission with an IP address and a DNS name,
>> resolving to identical IP address should not (in my opinion) be equal,
>> but is!  One SocketPermission should only imply the other while DNS
>> resolves to the same IP address, otherwise the equality of the two
>> SocketPermission's will change if the IP address is assigned to a
>> different domain!  Object equality / identity shouldn't depend on the
>> result of a possibly unreliable network source.
>>
>> SocketPermission and SocketPermissionCollection are broken, the only
>> solution I can think of is to re-implement these classes (from Harmony)
>> in the policy and SecurityManager, substituting the existing jvm
>> classes.  This would not be visible to client developers.
>>
>> SocketPermission's may also exist in a ProtectionDomain's static
>> Permissions, these would have to be converted by the policy when merging
>> the permissions from the ProtectionDomain with those from the policy.
>> Since ProtectionDomain, attempts to check it's own internal permissions,
>> after the policy permission check fails, DNS checks are currently
>> performed by duplicate SocketPermission's residing in the
>> ProectionDomain, this will no longer occur, since the permission being
>> checked will be converted to say for argument sake
>> org.apache.river.security.SocketPermission.  However because some
>> ProtectionDomains are static, they never consult the policy, so the
>> Permission's contained in each ProtectionDomain will require conversion
>> also, to do so will require extending and implementing a
>> ProtectionDomain that encapsulates existing ProtectionDomain's in the
>> AccessControlContext, by utilising a DomainCombiner.
>>
>> For CodeSource grant's, the policy file based grant's are defined by
>> URL's, however URL's identity depend upon DNS record results, similar to
>> SocketPermission equals and hashCode implementations which we have no
>> control over.
>>
>> I'm thinking about implementing URI based grant's instead, to avoid DNS
>> lookups, then allowing a policy compatibility mode to be enabled (with
>> logging) for falling back to CodeSource grant's when a URL cannot be
>> converted to a URI, this is a much simpler fix than the SocketPermission
>> problem.
>>
>> For Dynamic Policy Grants, because ProtectionDomain doesn't override
>> equals (that's a good thing), the contained CodeSource must also be
>> checked, again potentially slowing down permission checks with DNS
>> lookups, simply because CodeSource uses URL's.  Changing the Dynamic
>> Grant's to use URI based comparison would be relatively simple, since
>> the URI is obtained dynamically when the dynamic grant is created.
>>
>> URI based grant's don't use DNS resolution and would have a narrower
>> scope of implied CodeSources, an IP based grant won't imply a DNS domain
>> URL based CodeSource and vice versa.  Rather than rely on DNS
>> resolution, grant's could be made specifically for IPv4, IPv6 and DNS
>> names in policy files.  URL.toURI() can be utilised to check if URI
>> grant's imply a CodeSource without resorting to DNS.
>>
>> Any thoughts, comments or ideas?
>>
>> N.B. It's sad that security is implemented the way it is, it would be
>> far better if it was Executor based, since every protection domain could
>> be checked in parallel, rather than in sequence.
>>
>> Regards,
>>
>> Peter.
>>
>>
>>
>>     
>
>   


RE: RE: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Christopher Dolan <ch...@avid.com>.
To simulate the problem, go to InetAddress.getHostFromNameService() in your IDE, set a breakpoint on the "nameService.getHostByAddr" line with a condition of something like this:

     new java.util.concurrent.CountDownLatch(1).await(15, java.util.concurrent.TimeUnit.SECONDS)

then launch your River application from within the IDE. This will cause all reverse DNS lookups to stall for 15 seconds before succeeding. This will affect Reggie the worst because it has to verify so many hostnames. In a large group (a few thousand services) this will drive Reggie's thread count skyward, perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.

This problem happens in the real world in facilities that allow client connections to the production LAN, but do not allow the production LAN to resolve hosts in the client LAN. This may occur due to separate IT teams or strict security rules or simple configuration errors. Because most client-server systems, like web servers, do not require the server to contact the client this problem does not become immediately visible to IT. Instead, the question is inevitably "Why is Jini/River so sensitive to reverse DNS? All of my other services work fine."

Chris


Re: RE: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Tom Hobbs <tv...@googlemail.com>.
My biggest concern with such fundamental changes is controlling the impact
they will have.  I'm a pretty good example of this: I haven't experienced
the troubles these changes are intended to overcome.  I also haven't made
any attempt to dive into these areas of the code, for any reason.

Is it possible to put together a test case which exposes these problems and
also proves the solution?

Obviously, a test case involving misconfigured networks is daft; in that
instance a handy "is your network misconfigured?" diagnostic tool or
documentation would be a good idea.
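
As a rough illustration of what such a diagnostic could look like, the sketch below times a single reverse DNS lookup for an IP address. The class and method names (ReverseDnsProbe, timeReverseLookupMillis) are hypothetical and not part of River; only InetAddress.getByName and getCanonicalHostName are real JDK calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical probe: how long does one reverse DNS lookup take? */
final class ReverseDnsProbe {

    /**
     * Times a reverse lookup for a textual IP address.
     * Passing an IP literal to getByName does not itself query DNS;
     * getCanonicalHostName() is the call that asks the name service.
     */
    static long timeReverseLookupMillis(String ipLiteral) {
        final InetAddress address;
        try {
            address = InetAddress.getByName(ipLiteral);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a usable address: " + ipLiteral, e);
        }
        long start = System.nanoTime();
        // Falls back to the textual IP address if the lookup fails.
        address.getCanonicalHostName();
        return (System.nanoTime() - start) / 1_000_000L;
    }
}
```

Running this from the server LAN against a client address should show whether reverse lookups stall; a result of several seconds would point at the kind of misconfiguration discussed in this thread.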

Please don't interpret this concern as a criticism of your work, Peter.
Far from it.  It's just a comment born out of not really having any contact
with the area you're working in!


Grammar and spelling have been sacrificed on the altar of messaging via
mobile device.


RE: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Christopher Dolan <ch...@avid.com>.
Specifically for SocketPermission, I experienced severe timeout problems with reverse DNS misconfigurations. For some LAN-based deployments, I relaxed this criterion via 'new SocketPermission("*", "accept,listen,connect,resolve")'. This was difficult to apply to a general Sun/Oracle JVM, however, because the default security policy *prepends* a ("localhost:1024-","listen") permission that triggers the reverse DNS lookup. To avoid this inconvenient setting, I install a new java.security.Policy subclass that delegates to the default Policy except when the incoming permission is a SocketPermission. That way I don't need to modify the policy file in the JVM. The Policy.implies() override method is trivial because it just needs to do " if (permission instanceof SocketPermission) { ... }". The PermissionCollection methods were trickier to override (skip over any SocketPermission elements in the default Policy's PermissionCollection), but still only about 50 LOC.
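
A minimal sketch of a delegating Policy of this kind follows. It is not the code described above: the class name SocketLenientPolicy is hypothetical, the body of the instanceof branch (grant every SocketPermission) is an assumption that only suits a trusted LAN, and the PermissionCollection overrides are left out. Note that java.security.Policy is deprecated for removal in recent JDKs.

```java
import java.net.SocketPermission;
import java.security.Permission;
import java.security.Policy;
import java.security.ProtectionDomain;

/** Hypothetical Policy that short-circuits SocketPermission checks. */
@SuppressWarnings("removal") // Policy is deprecated for removal in newer JDKs
final class SocketLenientPolicy extends Policy {

    private final Policy delegate;

    SocketLenientPolicy(Policy delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean implies(ProtectionDomain domain, Permission permission) {
        if (permission instanceof SocketPermission) {
            // Assumption: every socket operation is allowed, so the
            // delegate's SocketPermission entries (and their DNS lookups)
            // are never consulted.
            return true;
        }
        return delegate.implies(domain, permission);
    }
}
```

On JVMs where a policy can still be installed, it would be wired in with something like Policy.setPolicy(new SocketLenientPolicy(Policy.getPolicy())).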

Chris


Re: Implications for Security Checks - SocketPermission, URL and DNS lookups

Posted by Peter Firmstone <ji...@zeus.net.au>.
I've been able to work around the SocketPermission equals and hashCode 
problems for DelegatePermission (which may contain a SocketPermission) 
by implementing equals and hashCode the same way Object does 
(reference ==), then using a static factory method to ensure there are 
no duplicates.  (Permission declares equals and hashCode abstract, so 
every subclass has to implement them.)  This enables a 
DelegatePermission containing a SocketPermission to be cached.  The 
speedup on repeat permission checks (20,000 calls) is a factor of 20.  
SocketPermission check results themselves are not cached by the 
SecurityManager, because of the broken equals and hashCode behaviour.
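
The identity-plus-factory idea can be sketched as below. This is not River's DelegatePermission: the class name InternedPermission and the string key (class name, name and actions, chosen so the factory never calls the wrapped permission's equals or hashCode) are assumptions made for illustration.

```java
import java.security.Permission;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical wrapper with identity-based equals and hashCode. */
final class InternedPermission extends Permission {

    // One instance per distinct (class, name, actions) triple.
    private static final ConcurrentMap<String, InternedPermission> INSTANCES =
            new ConcurrentHashMap<>();

    private final Permission wrapped;

    private InternedPermission(Permission wrapped) {
        super(wrapped.getName());
        this.wrapped = wrapped;
    }

    /** Static factory: returns the single shared wrapper for this key. */
    static InternedPermission of(Permission p) {
        String key = p.getClass().getName() + '|' + p.getName() + '|' + p.getActions();
        return INSTANCES.computeIfAbsent(key, k -> new InternedPermission(p));
    }

    @Override
    public boolean implies(Permission other) {
        return other instanceof InternedPermission
                && wrapped.implies(((InternedPermission) other).wrapped);
    }

    @Override
    public boolean equals(Object o) {
        return this == o; // same semantics as Object.equals
    }

    @Override
    public int hashCode() {
        return System.identityHashCode(this);
    }

    @Override
    public String getActions() {
        return wrapped.getActions();
    }
}
```

Because the factory guarantees one instance per key, reference equality is enough for cache lookups, and no DNS lookup can be triggered by a hash map probe. The implies method still delegates to the wrapped permission, so it can still cause lookups.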

I've also been able to split the ProtectionDomains contained in the 
AccessControlContext into separate permission checks using an 
ExecutorService, which reduces the time taken to perform network DNS 
lookups, since they execute in parallel, rather than in series.
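
A minimal sketch of the parallel check, assuming each ProtectionDomain check has already been wrapped as a Callable<Boolean>; the names ParallelChecks and allImply are hypothetical. It fails closed: interruption or an exception in any check counts as a denial.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical helper: run per-domain checks in parallel. */
final class ParallelChecks {

    /** Returns true only if every domain check returns true. */
    static boolean allImply(ExecutorService pool, List<Callable<Boolean>> domainChecks) {
        try {
            // invokeAll blocks until every task has completed.
            for (Future<Boolean> result : pool.invokeAll(domainChecks)) {
                if (!result.get()) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // fail closed
        } catch (ExecutionException e) {
            return false; // a check that threw is treated as a denial
        }
    }
}
```

With this shape the wall-clock cost of a check is roughly that of the slowest domain rather than the sum over all domains, which is where the saving on DNS-bound checks would come from.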

The Executor uses a formula to calculate the number of threads based on 
the available CPUs:

        double blocking_coefficient = 0.8; // 0 CPU intensive to 0.9 IO intensive
        int numberOfCores = Runtime.getRuntime().availableProcessors();
        int poolSize = (int) (numberOfCores / ( 1 - blocking_coefficient));

On my computer with 4 CPUs, the executor uses a maximum of 20 threads 
(this figure may require adjustment); it would calculate 160 threads on 
a 32 CPU box.
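
As a rough sketch of how the parallel check and the pool sizing might fit 
together (ParallelDomainCheck, impliedByAll and domainGranting are 
hypothetical names; privileged calls and DomainCombiners are ignored, 
and the sketch assumes that any failure or interruption denies access):

```java
import java.security.Permission;
import java.security.Permissions;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;
import java.util.PropertyPermission;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelDomainCheck {

    /** Pool size formula from the message above. */
    static int poolSize(int cores, double blockingCoefficient) {
        return (int) (cores / (1 - blockingCoefficient));
    }

    /** True only if every domain implies the permission; checks run in parallel. */
    static boolean impliedByAll(ExecutorService executor,
                                List<ProtectionDomain> domains,
                                Permission perm) {
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (ProtectionDomain pd : domains) {
            tasks.add(() -> pd.implies(perm));
        }
        try {
            // invokeAll blocks until every task has completed.
            for (Future<Boolean> result : executor.invokeAll(tasks)) {
                if (!result.get()) {
                    return false;
                }
            }
            return true;
        } catch (ExecutionException e) {
            return false; // assumption: fail closed if a check throws
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // assumption: fail closed on interruption
        }
    }

    /** Demo helper: a static ProtectionDomain granting the given permissions. */
    static ProtectionDomain domainGranting(Permission... granted) {
        Permissions perms = new Permissions();
        for (Permission p : granted) {
            perms.add(p);
        }
        return new ProtectionDomain(null, perms);
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(
                poolSize(Runtime.getRuntime().availableProcessors(), 0.8));
        try {
            Permission read = new PropertyPermission("user.home", "read");
            List<ProtectionDomain> stack =
                    List.of(domainGranting(read), domainGranting(read));
            System.out.println(impliedByAll(executor, stack, read));
        } finally {
            executor.shutdown();
        }
    }
}
```

The total wall-clock time then approaches that of the slowest single 
domain check rather than the sum of all of them, which is where the 
benefit for DNS-bound checks would come from.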

If your DNS doesn't support reverse lookups, for whatever reason, this 
won't fix that problem.

I have decided to replace the use of CodeSource URLs with URIs in the 
policy implementation, to eliminate DNS lookups there.

I'm undecided about reimplementing SocketPermission and 
SocketPermissionCollection.  It may be worth waiting to see if the 
concurrent policy and executor based security manager are sufficient to 
reduce the performance impact for most situations.

Downloaded proxies that are successfully unmarshalled are automatically 
given static Permissions in the ProtectionDomain constructor, but this 
is a specific SocketPermission which requires a reverse DNS lookup.  To 
improve performance after proxy verification, dynamically add a 
wildcard SocketPermission that won't require the reverse DNS lookup; 
the policy will be consulted first and will return faster.

Only use wildcard SocketPermissions in policy files, since this avoids 
reverse DNS lookups.  Remove other types of SocketPermission from 
policy files and replace them with wildcards; this may of course 
require an authenticated Principal.

Change the default JVM policy file that contains localhost to use the IP 
address 127.0.0.1 instead.  I could actually do that in the policy 
parser I'm writing, seeing as it happens so often!

grant {
<SNIP>
        // allows anyone to listen on un-privileged ports
        // permission java.net.SocketPermission "localhost:1024-", "listen";
        permission java.net.SocketPermission "127.0.0.1:1024-", "listen";
<SNIP>
}
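
A policy parser could apply that substitution along these lines.  This is 
only a sketch: LocalhostRewrite and rewriteTarget are hypothetical names, 
and it assumes that localhost always means the IPv4 loopback address, 
which may not hold on an IPv6-only host.

```java
/** Hypothetical helper for a policy parser; not part of any existing parser. */
final class LocalhostRewrite {

    /** Replaces a "localhost" host in a SocketPermission target with 127.0.0.1. */
    static String rewriteTarget(String target) {
        // The port range, if present, follows the last ':' in the target.
        int colon = target.lastIndexOf(':');
        String host = colon >= 0 ? target.substring(0, colon) : target;
        String portRange = colon >= 0 ? target.substring(colon) : "";
        return host.equalsIgnoreCase("localhost")
                ? "127.0.0.1" + portRange
                : target; // anything else is left untouched
    }

    public static void main(String[] args) {
        System.out.println(rewriteTarget("localhost:1024-"));
        System.out.println(rewriteTarget("example.org:8080"));
    }
}
```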

The PreferredClassProvider loader table that Chris was referring to with 
LookupLocatorDiscovery and Reggie scalability uses a synchronized map 
with URLs in its keys.  We could look into using URI based keys 
instead, since this is another source of DNS lookups.

URL is a particularly bad class to use as a key, because its equals and 
hashCode methods require DNS lookups, and each key in the loader table 
contains an array of URLs.
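
A URI based key might look something like the sketch below.  
UriLoaderTable is a hypothetical class for illustration only and does not 
reflect how PreferredClassProvider is actually structured; it also 
assumes every codebase URL can be converted to a URI.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical loader table keyed by URIs instead of URLs. */
final class UriLoaderTable {
    private final Map<List<URI>, ClassLoader> table = new ConcurrentHashMap<>();

    /** Converts a codebase to a key whose equals/hashCode never touch DNS. */
    static List<URI> key(URL... codebase) {
        List<URI> uris = new ArrayList<>(codebase.length);
        for (URL url : codebase) {
            try {
                uris.add(url.toURI());
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException(url.toString(), e);
            }
        }
        // List.equals and List.hashCode delegate to URI's string based ones.
        return Collections.unmodifiableList(uris);
    }

    /** Returns the loader already registered for the codebase, else the candidate. */
    ClassLoader loaderFor(URL[] codebase, ClassLoader candidate) {
        ClassLoader existing = table.putIfAbsent(key(codebase), candidate);
        return existing != null ? existing : candidate;
    }

    /** Builds a URL from a string without the deprecated URL constructors. */
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    public static void main(String[] args) {
        List<URI> a = key(url("http://example.org/a.jar"), url("http://example.org/b.jar"));
        List<URI> b = key(url("http://example.org/a.jar"), url("http://example.org/b.jar"));
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```

A ConcurrentHashMap is used here instead of a synchronized map, so 
lookups for different codebases need not block one another.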

Cheers,

Peter.
