Posted to dev@tomee.apache.org by David Blevins <da...@gmail.com> on 2011/03/24 22:36:59 UTC

Include+Exclude vs Exclude+Include

Started to type this in the previous email and wow did it get too long....

In general I'm still not sure what kind of properties we might want to use to configure all this.  Here's a sample of the xml I imagine:

An include based approach:

    <scanning>
    
      <includes>
    
        <package>org.superbiz</package>
        <package>org.wonderbiz</package>
        <class>com.techie.Widget</class>
    
        <exceptions>
          <package>org.superbiz.util</package>
          <class>com.superbiz.Foo</class>
          <pattern>.*Test</pattern>
        </exceptions>
    
      </includes>
    
    </scanning>

An exclude based approach:

    <scanning>

      <excludes>

        <pattern>org.*</pattern>

        <exceptions>
          <package>org.superbiz</package>
          <package>org.wonderbiz</package>
        </exceptions>

      </excludes>

    </scanning>
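To make the two samples concrete, here is a rough Java sketch of how such a file might be turned into a single "scan this class?" test. Nothing here is existing OpenEJB code: the class name `ScanXmlSketch` and the matching semantics (package = name prefix, class = exact name, pattern = regex over the whole class name) are assumptions chosen for illustration, and the two string constants are just the samples above with whitespace removed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads a <scanning> document shaped like the samples.
class ScanXmlSketch {

    static final String INCLUDE_SAMPLE = "<scanning><includes>"
            + "<package>org.superbiz</package><package>org.wonderbiz</package>"
            + "<class>com.techie.Widget</class>"
            + "<exceptions><package>org.superbiz.util</package>"
            + "<class>com.superbiz.Foo</class><pattern>.*Test</pattern></exceptions>"
            + "</includes></scanning>";

    static final String EXCLUDE_SAMPLE = "<scanning><excludes>"
            + "<pattern>org.*</pattern>"
            + "<exceptions><package>org.superbiz</package>"
            + "<package>org.wonderbiz</package></exceptions>"
            + "</excludes></scanning>";

    // Assumed semantics: <package> is a prefix, <class> an exact name,
    // <pattern> a regex that must match the whole class name.
    static Predicate<String> rule(Element e) {
        String text = e.getTextContent().trim();
        switch (e.getTagName()) {
            case "package":
                return name -> name.startsWith(text + ".");
            case "class":
                return name -> name.equals(text);
            case "pattern":
                Pattern p = Pattern.compile(text);
                return name -> p.matcher(name).matches();
            default:
                throw new IllegalArgumentException("Unknown rule: " + e.getTagName());
        }
    }

    // First direct child element with the given tag, or null if absent.
    static Element child(Element parent, String tag) {
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n instanceof Element && tag.equals(((Element) n).getTagName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Matches when any direct rule child (ignoring <exceptions>) matches.
    static Predicate<String> anyOf(Element parent) {
        List<Predicate<String>> rules = new ArrayList<>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n instanceof Element && !"exceptions".equals(((Element) n).getTagName())) {
                rules.add(rule((Element) n));
            }
        }
        return name -> rules.stream().anyMatch(r -> r.test(name));
    }

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Unreadable scanning XML", e);
        }
    }

    // Returns a test that answers "should this class be scanned?".
    static Predicate<String> parse(String xml) {
        Element root = root(xml);
        Element includes = child(root, "includes");
        Element block = includes != null ? includes : child(root, "excludes");
        if (block == null) {
            throw new IllegalArgumentException("Expected <includes> or <excludes>");
        }
        Element exceptions = child(block, "exceptions");
        Predicate<String> excepted = exceptions == null ? name -> false : anyOf(exceptions);
        // A rule "hits" when it matches and no exception overrides it.
        Predicate<String> hit = anyOf(block).and(excepted.negate());
        // Include block: scan the hits. Exclude block: scan everything else.
        return includes != null ? hit : hit.negate();
    }

    public static void main(String[] args) {
        Predicate<String> scan = parse(INCLUDE_SAMPLE);
        for (String name : new String[] {"org.superbiz.Foo", "org.superbiz.util.Helper", "com.other.Thing"}) {
            System.out.println(name + " -> " + scan.test(name));
        }
    }
}
```

Note that the two forms end up differing only in the last line of `parse`, which is the "one or two lines" point in code form.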

Let me get straight to the point and say I'm not sure an exclude-based approach is useful.  Factually, whichever has the fewest rules will be faster.  Our current classpath filtering has a lot of built-in rules that get turned on if you change the default settings, and is hence a little slow if you actually use it.


I copied the Include+Exclude vs Exclude+Include from HTTPD.  They call it Allow,Deny vs Deny,Allow.  The names and descriptions are not great and people clearly misunderstand them.  Here's how they have them documented:

  Ordering is one of:

    Allow,Deny   

      First, all Allow directives are evaluated; at least one must
      match, or the request is rejected. Next, all Deny directives are
      evaluated. If any matches, the request is rejected. Last, any
      requests which do not match an Allow or a Deny directive are
      denied by default.

    Deny,Allow

      First, all Deny directives are evaluated; if any match, the
      request is denied unless it also matches an Allow directive. Any
      requests which do not match any Allow or Deny directives are
      permitted.
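Read as code, the two orderings quoted above each come down to one boolean expression. This is only a sketch of the documented behavior, not HTTPD's implementation; the class and method names are made up, and requests are modeled as plain strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the two documented orderings.
class OrderSemantics {

    static boolean any(List<Predicate<String>> rules, String request) {
        return rules.stream().anyMatch(r -> r.test(request));
    }

    // "Allow,Deny": must match an Allow and no Deny; anything unmatched is denied.
    static boolean allowDeny(List<Predicate<String>> allow, List<Predicate<String>> deny, String request) {
        return any(allow, request) && !any(deny, request);
    }

    // "Deny,Allow": rejected only when a Deny matches and no Allow does;
    // anything unmatched is permitted.
    static boolean denyAllow(List<Predicate<String>> allow, List<Predicate<String>> deny, String request) {
        return !any(deny, request) || any(allow, request);
    }

    public static void main(String[] args) {
        List<Predicate<String>> allow = Arrays.asList(host -> host.startsWith("10."));
        List<Predicate<String>> deny = Arrays.asList(host -> host.equals("10.0.0.9"));
        for (String host : new String[] {"10.0.0.1", "10.0.0.9", "192.168.1.1"}) {
            System.out.println(host + " Allow,Deny=" + allowDeny(allow, deny, host)
                    + " Deny,Allow=" + denyAllow(allow, deny, host));
        }
    }
}
```

The host that matches neither list is where the two differ: it is rejected under Allow,Deny and permitted under Deny,Allow.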

The descriptions are OK enough, but the names imply the opposite in my brain.  They seem to conflict in my reading of them because there are actually THREE levels of things going on.  The default behavior is NOT in the name.  It should be something like:

    Default Deny, Allow, Deny   

      First, all Allow directives are evaluated....

    Default Allow, Deny, Allow

      First, all Deny directives are evaluated....

As a result of the default not being in the name and the topic being somewhat confusing, you can find tons of blog posts from people advising using "Deny,Allow" with a DenyAll rule.  It feels good to have deny listed first in your ordering and it feels great to *see* "Deny All" as a rule, but what you've actually done is:

    Default Allow, Deny All rule, Allow rules

You just lost one of your three levels.  That very last level is meant to allow you to configure exceptions to your rules.  So I don't think they should be called "Deny,Allow" but simply referred to as they are: exceptions.  That gives you:

    Default Deny, Allow rules, Allow rule exceptions

      First, all Allow directives are evaluated....

    Default Allow, Deny rules, Deny rule exceptions

      First, all Deny directives are evaluated....

If people actually had to *see* Default Allow sitting next to Deny All, they'd probably be more inclined to read the docs better.


Anyway, I'm not sure you ever really need both.  The most useful for our situation is likely Default Deny, Allow Rules, Allow Rule Exceptions.  Or in our terms, Default Exclude, Include Rules, Include Rule Exceptions.

Implementation-wise it's extremely easy to support both.  The two differ by one or two lines of code.  But...

Configuration- and documentation-wise, I'm not sure that we want to present it all or how to present it all.  If we do accept both we should definitely flag situations where people make the above mistake.
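Flagging that mistake could be as simple as checking whether an exclude rule matches everything, in which case the "exceptions" are really just includes in disguise. A rough sketch follows; the class name and the list of "match everything" spellings are assumptions, and a real check would need to be smarter about equivalent regexes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical check: warns when an exclude list contains an "exclude all" rule.
class DenyAllCheck {

    // Heuristic only: recognizes a couple of obvious "match everything" spellings.
    static boolean matchesEverything(Pattern p) {
        String s = p.pattern();
        return s.equals(".*") || s.equals("^.*$");
    }

    // True when the excludes amount to "Default Allow, Deny All".
    static boolean excludesEverything(List<Pattern> excludes) {
        return excludes.stream().anyMatch(DenyAllCheck::matchesEverything);
    }

    public static void main(String[] args) {
        List<Pattern> excludes = Arrays.asList(Pattern.compile(".*"));
        if (excludesEverything(excludes)) {
            System.out.println("Warning: excluding everything and listing exceptions "
                    + "is an include list; consider using includes instead.");
        }
    }
}
```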


Open to any thoughts on how to expose the possibilities in both property and xml forms.


I'm somewhat leaning towards only ever showing and documenting a default deny, includes, include exceptions approach and just having the inverse (default allow, excludes, exclude exceptions) be something that's just there and we don't mention unless someone complains.

Technically speaking, whatever has the fewest rules will perform the best.


-David





Re: Include+Exclude vs Exclude+Include

Posted by Jean-Louis MONTEIRO <je...@gmail.com>.
I like the grep example. It's easier to understand.

Regarding the file, we could also do as WAS does and add a MANIFEST entry, but
after using that a bit, it seems more difficult to write and to maintain
without a Maven plugin.

Jean-Louis


Re: Include+Exclude vs Exclude+Include

Posted by David Blevins <da...@gmail.com>.
On Mar 24, 2011, at 10:46 PM, Rniamo wrote:

> Looking at what we have today in standalone mode, I think exceptions are
> useless: people can (should) split their applications into packages
> correctly, IMHO. Managing classes and packages is (still IMHO) not
> interesting for the same reason. For me, an include/exclude pattern like in
> standalone mode should be enough: it is not a standard feature, it is simply
> a helper feature. Regexp should really be enough (often people will easily
> include jars with a regexp or with a few "or package" patterns).

Definitely agree that a good regex easily trumps fixed strings (class, package).  I do like the exceptions though.  Often when I grep stuff I tend to do:

   grep [an already complicated expression] | grep -v [stuff I don't want but was too hard to work into my expression]

That second 'grep -v' is the exceptions or "excludes".  Now that I look at it in grep form, maybe plain include/exclude *is* the easiest to understand/describe.

  cat classes.txt | grep includes | grep -v excludes | scan

So maybe:

 openejb.deployments.classes.include
 openejb.deployments.classes.exclude

The exclude,include order is less easy to describe in Unix terms.  But the Java code is tiny, so we can show that.
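For illustration, here is roughly what both orders could look like in Java. This is a sketch, not the actual OpenEJB code: `GrepStyleFilter` is a made-up name, the two property names are just the ones proposed above, and `find()` is used so the regexes behave like grep (matching anywhere in the name).

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch of the two filter orders over a list of class names.
class GrepStyleFilter {

    // include,exclude: cat classes.txt | grep include | grep -v exclude
    static List<String> includeExclude(List<String> classes, Pattern include, Pattern exclude) {
        return classes.stream()
                .filter(name -> include.matcher(name).find())
                .filter(name -> !exclude.matcher(name).find())
                .collect(Collectors.toList());
    }

    // exclude,include: drop what exclude matches, unless include rescues it.
    static List<String> excludeInclude(List<String> classes, Pattern include, Pattern exclude) {
        return classes.stream()
                .filter(name -> !exclude.matcher(name).find() || include.matcher(name).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Proposed property names; the fallback values are arbitrary examples.
        Pattern include = Pattern.compile(
                System.getProperty("openejb.deployments.classes.include", "^org\\.superbiz\\."));
        Pattern exclude = Pattern.compile(
                System.getProperty("openejb.deployments.classes.exclude", "Test$"));
        List<String> classes = Arrays.asList(
                "org.superbiz.Foo", "org.superbiz.FooTest", "org.apache.Bar", "com.techie.Widget");
        System.out.println(includeExclude(classes, include, exclude));
    }
}
```

The only difference between the two methods is how the two matches are combined: `include && !exclude` versus `!exclude || include`.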

> Another point, though I'm less sure I understand what you want to do: do you
> want to add an XML file for it? Couldn't it be included in openejb.xml (in
> containers, for example), or as a context-param for a webapp and system
> properties for standalone mode?

Right, a spec-generic XML file applicable to any archive.  Optional of course.

Certainly it could be in all those places.  The CDI spec also wants to add their own equivalent flags in their descriptor.  That's sort of the issue right there.  A webapp, for example, can be an EjbModule, a JpaModule, a CdiModule, a JsfModule and of course a WebModule.  If everyone has their own include/exclude settings and these rules are "fighting" to define scanning for the same archive, there are only a few ways to deal with it:

  1. Merge all the rules and scan for "stuff" just once.
  2. Rescan for each set of rules (possibly up to 5 times per webapp).

A simple XML file at least stands a chance of adding a little unity to the scanning for the app and can be used by any module.  Say META-INF/scan.xml.  It could even be used to say "don't scan me".
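Option 1 above (merge the rules, scan once) might look something like the following. This is purely a sketch under assumptions: `MergedScan` and the module keys are invented, each module's rules are reduced to a single class-name test, and the expensive part of real scanning (reading bytecode) is only marked by a comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical single-pass scan shared by several module types.
class MergedScan {

    // Walks the class names once and hands each accepted name to every
    // module whose own filter wants it.
    static Map<String, List<String>> scanOnce(List<String> classNames,
                                              Map<String, Predicate<String>> moduleFilters) {
        // Merged rule: a class is worth reading if at least one module wants it.
        Predicate<String> merged = moduleFilters.values().stream()
                .reduce(name -> false, Predicate::or);

        Map<String, List<String>> result = new LinkedHashMap<>();
        moduleFilters.keySet().forEach(module -> result.put(module, new ArrayList<>()));

        for (String name : classNames) {
            if (!merged.test(name)) {
                continue; // no module is interested: the class is never read
            }
            // A real scanner would read the class bytes here, exactly once.
            moduleFilters.forEach((module, filter) -> {
                if (filter.test(name)) {
                    result.get(module).add(name);
                }
            });
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Predicate<String>> filters = new LinkedHashMap<>();
        filters.put("ejb", name -> name.startsWith("org.superbiz."));
        filters.put("cdi", name -> name.endsWith("Bean"));
        System.out.println(scanOnce(
                Arrays.asList("org.superbiz.OrderBean", "org.superbiz.Util",
                        "com.techie.WidgetBean", "com.techie.Widget"),
                filters));
    }
}
```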


-David



Re: Include+Exclude vs Exclude+Include

Posted by Rniamo <rn...@gmail.com>.
Looking at what we have today in standalone mode, I think exceptions are
useless: people can (should) split their applications into packages
correctly, IMHO. Managing classes and packages is (still IMHO) not
interesting for the same reason. For me, an include/exclude pattern like in
standalone mode should be enough: it is not a standard feature, it is simply
a helper feature. Regexp should really be enough (often people will easily
include jars with a regexp or with a few "or package" patterns).

Another point, though I'm less sure I understand what you want to do: do you
want to add an XML file for it? Couldn't it be included in openejb.xml (in
containers, for example), or as a context-param for a webapp and system
properties for standalone mode?

- Rniamo

