Posted to derby-dev@db.apache.org by "A B (JIRA)" <de...@db.apache.org> on 2006/09/29 23:27:20 UTC

[jira] Created: (DERBY-1908) Investigate: What's the "unit" for optimizer cost estimates?

Investigate: What's the "unit" for optimizer cost estimates?
------------------------------------------------------------

                 Key: DERBY-1908
                 URL: http://issues.apache.org/jira/browse/DERBY-1908
             Project: Derby
          Issue Type: Task
          Components: Performance, SQL
            Reporter: A B


The Derby optimizer's decisions are necessarily based on cost estimates.  But what are the "units" for these cost estimates?  There is logic in OptimizerImpl.getNextPermutation() that treats cost estimates as if their unit is milliseconds--but is that really the case?
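
For illustration, the check in question presumably has roughly this shape (a minimal sketch; the variable names here are hypothetical, not the actual OptimizerImpl code):

    // Hypothetical sketch: a timeout check of this shape only makes sense
    // if the cost estimate is in milliseconds, since elapsed wall-clock
    // time is compared directly against the estimated cost of the best plan.
    long elapsedMillis = System.currentTimeMillis() - optimizationStartMillis;
    boolean timeExceeded = bestPlanCost.getEstimatedCost() < elapsedMillis;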

The answer to that question may in fact be "Yes, the units are milliseconds"--and maybe the unexpected cost estimates that are sometimes seen are really caused by something else (e.g. DERBY-1905).  But if that is the case, it would be great to look at the optimizer costing code (see esp. FromBaseTable.estimateCost()) to verify that all of the "magic" of costing really makes sense, given that the underlying unit is supposed to be milliseconds.
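
To make the question concrete, a scan estimate is typically composed along the following lines (a sketch with made-up names, not Derby's actual code): if the base constants were calibrated in milliseconds the result is in milliseconds, and otherwise the timeout comparison above has no meaningful unit.

    // Hypothetical composition of a scan cost from calibrated constants.
    double scanCost = baseScanStartupCost
                    + estimatedRowCount  * perRowFetchCost
                    + estimatedPageCount * perPageReadCost;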

Also, if the stats/cost estimate calculations are truly meant to be in terms of milliseconds, I can't help but wonder on what machine/criteria the determination of milliseconds is based.  Is it time to update the stats for "modern" machines, or perhaps (shooting for the sky) to dynamically adjust the millisecond stats based on the machine that's running Derby and use the adjusted values somehow?  I have no answers to these questions, but I think it would be great if someone out there was inclined to discuss/investigate these kinds of questions a bit more...

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (DERBY-1908) Investigate: What's the "unit" for optimizer cost estimates?

Posted by "Mike Matrigali (JIRA)" <de...@db.apache.org>.
     [ http://issues.apache.org/jira/browse/DERBY-1908?page=all ]

Mike Matrigali updated DERBY-1908:
----------------------------------


Here is the "units" view from the storage layer, which I believe should be the basis for all the optimizer costs.  

The actual interface does not specify a unit.  This was originally a decision to allow for a number of different
implementations.  The guarantee was that, across all calls, one could compare one cost to another and
get reasonable results.  Having said that, the actual implementation of the costs returned by store has always
been based on ms. of elapsed time for a set of basic operations.  These basic operations were run and then
a set of constants was defined.  The last time this was done was quite a while ago, probably on a 400MHz machine.
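
Presumably that calibration looked something like the following (a sketch with invented names; the real constants are in StoreCostController, mentioned below):

    // Time a basic operation on the reference machine and record the
    // average elapsed ms per operation as a hard-coded cost constant.
    final int ITERATIONS = 1000000;       // illustrative count
    long start = System.currentTimeMillis();
    for (int i = 0; i < ITERATIONS; i++)
        fetchCachedRow();                 // hypothetical "basic operation"
    double msPerFetch =
        (System.currentTimeMillis() - start) / (double) ITERATIONS;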

The "hidden" unit of ms. was broken when the optimizer added timeout - which is basically a decision to stop
optimizing once the estimated cost is less than the elapsed time of the compile.  At this point something outside
the interface assumed the unit was ms.

I think a good direction would be to change the interfaces to somehow support costs as truly elapsed time,
fix at least the defaults to be based on a modern machine, fix any optimizer code that may not currently be
treating the cost unit correctly (like multiplying a cost by a cost), and maybe look at dynamically sizing the
costs based on the current machine's measured operation times.
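
As a sketch of that last idea, the stored reference constants could be rescaled at startup by a quick measurement on the current machine (all names here are hypothetical):

    // Re-run a basic operation on this machine, compare it to the value
    // measured on the original reference machine, and scale every cost
    // constant by the resulting ratio.
    double observedMsPerFetch = timeBasicRowFetch();   // hypothetical probe
    double scale = observedMsPerFetch / REFERENCE_MS_PER_FETCH;
    double adjustedRowFetchCost = BASE_ROW_FETCH_COST * scale;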

I will look around for the old unit tests that produced the original costs.  You can see the constants used in
java/engine/org/apache/derby/iapi/store/access/StoreCostController.java.



[jira] Closed: (DERBY-1908) Investigate: What's the "unit" for optimizer cost estimates?

Posted by "Rick Hillegas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Hillegas closed DERBY-1908.
--------------------------------

    Resolution: Won't Fix

It looks to me as though the question was answered.




[jira] Updated: (DERBY-1908) Investigate: What's the "unit" for optimizer cost estimates?

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dag H. Wanvik updated DERBY-1908:
---------------------------------

    Derby Categories: [Performance]

