You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Knut Anders Hatlen (JIRA)" <ji...@apache.org> on 2009/10/21 13:41:59 UTC

[jira] Created: (DERBY-4416) Handle comparison of two constants as a boolean constant

Handle comparison of two constants as a boolean constant
--------------------------------------------------------

                 Key: DERBY-4416
                 URL: https://issues.apache.org/jira/browse/DERBY-4416
             Project: Derby
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 10.5.3.0
            Reporter: Knut Anders Hatlen
            Priority: Minor


In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.

Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.

Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.

This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.

If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-4416:
--------------------------------------

    Attachment: d4416-1a.stat
                d4416-1a.diff

Here's an updated patch with the visitor renamed to ConstantExpressionVisitor, the method in ValueNode renamed to evaluateConstantExpression(), and with updated comments.

Description of the other changes from the previous patch:

1) The invocation of the visitor is moved to DMLStatementNode.optimizeStatement(). The previous patch called it from SelectNode and JoinNode on WHERE/HAVING/ON clauses. As mentioned by Bryan, this transformation can be useful in other parts of the tree as well (as seen in VALUES 1+3, although evaluation of arithmetic operations hasn't implemented). I think it's cleaner to have the invocation at one single location. Also, since there might be sub-queries inside WHERE clauses, the old patch might traverse some parts of the tree many times.

The visitor now walks the entire query tree between the preprocessing phase and the optimization phase. Doing it as late as possible is advantageous because then other simplifications (like the NOT elimination performed in preprocess()) increase the chances of finding expressions with a know value, but it should be performed before optimization so that the optimizer can take advantage of the more accurate selectivity predictions.

2) JoinNode: Made acceptChildren() call accept() on joinPredicates. The previous patch visited the JoinNode a little earlier, when the ON clause was still represented by joinClause, whereas this patch visits the JoinNode after the predicates have been moved to joinPredicates. There are also other children of JoinNode that are not visited, but I didn't do anything with them for now.

3) BinaryRelationalOperatorNode: Added rewriting for less-than and less-equals, which were forgotten in the previous patch.

4) logop.out: Accept that a statement that used to fail with arithmetic overflow now succeeds because the arithmetic isn't performed at runtime. The statement that fails is
> select x from s where 2147483647 + 10 = 2 and (1=2);
whereas this almost identical one is expected to succeed
> select x from s where (1=2) and 2147483647 + 10 = 2;
I'm not aware of anything in the SQL standard that requires us to evaluate the arithmetic expression first, so I think it's equally OK to pass the first statement as it is to pass the second statement. The comments in the test also indicate this.

4) specjPlans.out: Removed unneeded ProjectRestrictResultSet from two of the query plans (they were there to enforce the restriction 1=1).

5) outerjoin.out: Updated query plan from nested loop join to hash join in a join with 1=1 in the ON clause. New plan looks OK and the results are still correct. The plan probably changed because 1=1 and TRUE have different (estimated) selectivity.

I haven't run all the regression tests on the latest revision of the patch. Will report back when I have the results.

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768274#action_12768274 ] 

Knut Anders Hatlen commented on DERBY-4416:
-------------------------------------------

Yes, that's the idea. Or 3+1<4  --> 4<4 ---> false. I'm not sure if we want to do it, but I thought that since it would be possible using the same mechanism, there's no reason to have "boolean" in the method/class names.

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Bryan Pendleton (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768266#action_12768266 ] 

Bryan Pendleton commented on DERBY-4416:
----------------------------------------

I'm not sure I understand your more-general idea.

Would it mean that, e.g.,

    select 1+2 from t

would be transformed into

   select 3 from t


> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778370#action_12778370 ] 

Knut Anders Hatlen commented on DERBY-4416:
-------------------------------------------

Thanks, Dag! I changed the wording to this in revision 880791:

     * @return a node representing a Boolean constant if the result of the
     * operator is known; otherwise, this operator node

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>             Fix For: 10.6.0.0
>
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-4416:
--------------------------------------

    Attachment: replaceExpressions.diff

Attached is a patch (not for commit) that I've experimented with. It adds a visitor that replaces (some) expressions that are guaranteed to evaluate to true or false with a boolean constant node. This visitor is used on the WHERE clauses and HAVING clauses from SelectNode.normExpressions() and on the ON clauses in JoinNode.normExpressions().

I've seen that for instance this statement

  SELECT * FROM T WHERE 1=1

is translated into a ProjectRestrictResultSet on top of a table scan without the patch, and simply to a table scan with the patch.

I tested it with suites.All and derbyall, and saw three failures:

1) lang/outerjoin.sql: Changed query plans. Looks like the expected changes, in that some ProjectRestrictResultSets are eliminated, but will have to check to be sure.

2) lang/specjPlans.sql: Changed query plans. These also look benign, but will have to check.

3) lang/logop.sql: Query that previously failed now succeeds and returns no rows. Expected output is below, and judging by the comment in the test, it looks as if the empty result that was actually supposed to be the expected one. The reason why it stops failing, is that (1=2) is rewritten to false and PredicateList.restorePredicates() is therefore able eliminate the other half of the AND node (2147483647 + 10 = 2) that causes integer overflow.

ij> -- ... and false and ... should get resolved to false
select x from s where 2147483647 + 10 = 2 and (1=2);
ERROR 22003: The resulting value is outside the range for the data type INTEGER.

Now:

ij> -- ... and false and ... should get resolved to false
select x from s where 2147483647 + 10 = 2 and (1=2);
X          
-----------

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Closed: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen closed DERBY-4416.
-------------------------------------


> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>             Fix For: 10.6.0.0
>
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768213#action_12768213 ] 

Knut Anders Hatlen commented on DERBY-4416:
-------------------------------------------

The visitor and the methods in the patch could probably have more general names, since they could be used to all sorts of expression-to-constant transformations.

So instead of the ValueNode.getBooleanEquivalent() method in the current patch, we could have a ValueNode.getConstantEquivalent(). The override in BinaryRelationalOperatorNode would return a boolean constant like in the patch. In addition, possibly as part of separate issues later, overrides could be added to other nodes, like in BinaryArithmeticOperatorNode to return the result of arithmetic operations as a (non-boolean) constant.

Currently, since the accept() methods call visit() on the parent node before the children, the patch is not able to fully rewrite expressions like ((1=1) = (3=3)) to true (well, this expression was probably disallowed by the recent DERBY-887 commit, but it will probably be allowed again when the boolean data type is reintroduced). The rewriting stops at (true = true). Making BinaryRelationalOperatorNode.getBooleanEquivalent() call getBooleanEquivalent() on the left operand and the right operand before checking whether they are constant nodes, should fix this, I think.

Adding an override in IsNullNode to handle IS NULL/IS NOT NULL should be fairly trivial too, I think.

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen resolved DERBY-4416.
---------------------------------------

          Resolution: Fixed
       Fix Version/s: 10.6.0.0
    Issue & fix info:   (was: [Patch Available])

Committed revision 880693.

I'm marking the issue as resolved. If there are other expressions that we want to rewrite using this mechanism, we can file separate issues for those improvements.

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>             Fix For: 10.6.0.0
>
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-4416:
--------------------------------------

    Issue & fix info: [Patch Available]

All the regression tests ran cleanly.

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778356#action_12778356 ] 

Dag H. Wanvik commented on DERBY-4416:
--------------------------------------

Thanks for this patch, Knut. The placement of accept for the new visitor in SMLStatementNode looks good.
+1.

Nit: in BinaryRelationalOperatorNode#evaluateConstantExpressions's
Javadoc, the text in the @return tag is slightly misleading:

@return a Boolean constant if the result of the operator is known, or
        the operator node otherwise

One could be led to believe is is returning a Boolean object; I like better the phrasing you use for
#newBool: "@return a node representing a Boolean constant".




> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>             Fix For: 10.6.0.0
>
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (DERBY-4416) Handle comparison of two constants as a boolean constant

Posted by "Knut Anders Hatlen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen reassigned DERBY-4416:
-----------------------------------------

    Assignee: Knut Anders Hatlen

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and 1<>1 to express true and false. Generated SQL statements also tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode, if the predicate contains constant true or false values. However, during parsing and compilation, expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean optimization code. This leads to more complex, and possibly less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since they will always match no rows when false and all rows when true. The expression 1=1 does however get it's selectivity from the = operator, which means that it'll be 0.1. The same selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has the selectivity 0.33, even though the actual selectivity of the expression is the same as 1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3 to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between constant values with a boolean constant at bind time, such queries would end up with simpler byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting in a better plan being chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.