Posted to derby-dev@db.apache.org by "Kristian Waagan (Created) (JIRA)" <ji...@apache.org> on 2012/04/04 22:39:25 UTC

[jira] [Created] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Explore possible istat daemon improvements and optimizations
------------------------------------------------------------

                 Key: DERBY-5684
                 URL: https://issues.apache.org/jira/browse/DERBY-5684
             Project: Derby
          Issue Type: Task
            Reporter: Kristian Waagan


A task tracking some experiments on the istat daemon and the statistics update code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Dag H. Wanvik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273654#comment-13273654 ] 

Dag H. Wanvik commented on DERBY-5684:
--------------------------------------

Looked at the latest patch. It seems to me to do what you describe :) Some comments and questions:

- FromBaseTable:

  for (int i=0; i < cds.length; i++) {
      ConglomerateDescriptor tmpCd = cds[i];                    
      if (tmpCd.isIndex()) {                                    
          IndexRowGenerator irg = tmpCd.getIndexDescriptor();   
          // The case where we have a table with a              
          // single-column primary key/index is pretty          
          // common, so avoid engaging the istat daemon.        
          if (!irg.isUnique() ||                                
                  irg.numberOfOrderedColumns() > 1) {           
              qualified = true;                                 
              break;                                            
          }                                                     
      }                                                         
  }                                                             

  I think I'd reverse the logic here for clarity to:

  for (int i=0; i < cds.length; i++) {                          
      ConglomerateDescriptor tmpCd = cds[i];                    
      if (tmpCd.isIndex()) {                                    
          IndexRowGenerator irg = tmpCd.getIndexDescriptor();   
          // The case where we have a table with a              
          // single-column primary key/index is pretty          
          // common, so avoid engaging the istat daemon. 
          if (irg.isUnique() && irg.numberOfOrderedColumns() == 1) {
              continue;
          }

          qualified = true;                                 
          break;         
      }                                                         
  }                                                             
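
  To check that the two loop shapes agree, here is a minimal, self-contained sketch. The Index class is a hypothetical stand-in for an index conglomerate (it is not Derby's ConglomerateDescriptor or IndexRowGenerator), and the sketch assumes every index has at least one ordered column, so "not greater than 1" means "exactly 1". Non-index conglomerates, which both loops skip, are left out.

```java
import java.util.Arrays;
import java.util.List;

public class QualifyCheck {
    /** Hypothetical stand-in for an index conglomerate; not Derby's API. */
    public static final class Index {
        final boolean unique;
        final int orderedColumns;

        public Index(boolean unique, int orderedColumns) {
            this.unique = unique;
            this.orderedColumns = orderedColumns;
        }
    }

    /** Original shape: qualify on the first non-unique or multi-column index. */
    public static boolean qualifiesOriginal(List<Index> indexes) {
        for (Index ix : indexes) {
            if (!ix.unique || ix.orderedColumns > 1) {
                return true;
            }
        }
        return false;
    }

    /** Reversed shape: skip single-column unique indexes, qualify on the rest. */
    public static boolean qualifiesReversed(List<Index> indexes) {
        for (Index ix : indexes) {
            if (ix.unique && ix.orderedColumns == 1) {
                continue;
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<Index> pkOnly = Arrays.asList(new Index(true, 1));
        List<Index> mixed = Arrays.asList(new Index(true, 1), new Index(false, 1));
        // Each line prints the two results side by side; they should match.
        System.out.println(qualifiesOriginal(pkOnly) + " " + qualifiesReversed(pkOnly));
        System.out.println(qualifiesOriginal(mixed) + " " + qualifiesReversed(mixed));
    }
}
```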


- IndexStatisticsAnalyzer:
> Without this restriction the algorithm will drop valid statistics entries.
Change wording to "would drop".

Essentially a set: qualifiedCds, implement as such? (e.g. "contains" predicate cleaner than a search loop).
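
As a rough illustration of the set suggestion (not the code in the patch), the sketch below contrasts a search loop with a "contains" call. It assumes, purely for the example, that qualified conglomerates are tracked by a long conglomerate number; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class QualifiedCdsSketch {
    /** Search-loop membership test over a list of conglomerate numbers. */
    public static boolean containsByLoop(List<Long> qualifiedCds, long conglomNumber) {
        for (Long candidate : qualifiedCds) {
            if (candidate.longValue() == conglomNumber) {
                return true;
            }
        }
        return false;
    }

    /** The same collection held as a set; membership becomes a single call. */
    public static Set<Long> asSet(List<Long> qualifiedCds) {
        return new HashSet<>(qualifiedCds);
    }

    public static void main(String[] args) {
        List<Long> qualified = Arrays.asList(1168L, 1185L);
        Set<Long> qualifiedSet = asSet(qualified);
        // Both approaches should agree for present and absent members.
        System.out.println(containsByLoop(qualified, 1185L) == qualifiedSet.contains(1185L));
        System.out.println(containsByLoop(qualified, 42L) == qualifiedSet.contains(42L));
    }
}
```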

What scenario corresponds to this latter case below (irg == null):?

if (!cd.isIndex() || irg == null) {
    return DISQUALIFIED;
}

- IndexStatisticsDaemonImpl:

// TODO: Is it ok to always invalidate, or should logic be different
//       for background and explicit mode?

What are the pros and cons here? I would assume both should invalidate?

                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration-1.diff, derby-istat-exploration-2.diff, derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.


[jira] [Commented] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278792#comment-13278792 ] 

Kristian Waagan commented on DERBY-5684:
----------------------------------------

I have decided to pull the plug on the possible optimization for foreign key indexes. I don't understand it well enough, and it is hard to verify what it actually does with respect to the choices made by the optimizer.
With that decision, the following pieces remain:
 a) Drop disposable statistics entries. I will post the patch under DERBY-5680.
 b) Skip statistics for single-column primary keys. I will post the patch under DERBY-3790 after (a) is done.
 c) Move invalidation when updating statistics. Committed as DERBY-5770.
                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration-1.diff, derby-istat-exploration-2.diff, derby-istat-exploration-3.diff, derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.


[jira] [Commented] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276733#comment-13276733 ] 

Kristian Waagan commented on DERBY-5684:
----------------------------------------

DW>> What scenario corresponds to this latter case below (irg == null):?
KW> I don't remember... I'll remove irg == null and see if I added it because a test fell over (I think so, but maybe I had a bug somewhere else that caused this behavior). 

I added the following code to the ConglomerateDescriptor constructor and ran suites.All (indexable and indexRowGenerator are both declared as final):
+        if (indexable && indexRowGenerator == null) {
+            throw new IllegalStateException("irg is null, indexable is true");
+        }

I didn't see any test failures, so my conclusion is that the check in IndexStatisticsAnalyzer can be dropped.
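
To spell out the reasoning: if the constructor rejects "indexable without an index row generator" and both fields are final, then isIndex() implies a non-null generator, and the extra null check in the analyzer is redundant. The sketch below shows that pattern in isolation. CheckedDescriptor is a hypothetical stand-in, not Derby's ConglomerateDescriptor, and a clean test run only shows that no test violated the invariant; it is not a proof.

```java
public class CheckedDescriptor {
    // Both fields are final, so the invariant checked in the constructor
    // holds for the whole lifetime of the object.
    private final boolean indexable;
    private final Object indexRowGenerator; // hypothetical stand-in type

    public CheckedDescriptor(boolean indexable, Object indexRowGenerator) {
        if (indexable && indexRowGenerator == null) {
            throw new IllegalStateException("irg is null, indexable is true");
        }
        this.indexable = indexable;
        this.indexRowGenerator = indexRowGenerator;
    }

    public boolean isIndex() {
        return indexable;
    }

    public Object getIndexDescriptor() {
        return indexRowGenerator;
    }

    /** With the invariant in place, "!isIndex() || irg == null" reduces to this. */
    public boolean isDisqualified() {
        return !isIndex();
    }

    public static void main(String[] args) {
        try {
            new CheckedDescriptor(true, null);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```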
                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration-1.diff, derby-istat-exploration-2.diff, derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.


[jira] [Updated] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5684:
-----------------------------------

    Attachment: derby-istat-exploration-2.diff

Another partial version.
I have made a few more changes since the last run, but an earlier version made suites.All run without failures.
I still need to look into upgrade and upgrade testing.

Uploading now because I need to look into TableDescriptor / DataDictionary to progress.
                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration-1.diff, derby-istat-exploration-2.diff, derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.


[jira] [Updated] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Kristian Waagan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5684:
-----------------------------------

    Attachment: derby-istat-exploration.diff

Attaching a first patch showing some experiments I did on the istat code (based on existing JIRAs and the current discussions on derby-dev).

This is highly experimental code.
There are some failures in the tests.

Feel free to comment on the general approach, take the patch for a spin, and/or suggest further improvements :)

What it does:
 o deletes orphaned stats
 o skips statistics generation for single-column unique indexes
 o uses the index row estimate (instead of scanning) for single-row non-unique indexes backed by a unique index (i.e. foreign key)
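
A very rough sketch of the first item, under a loud assumption: "orphaned" is taken here to mean a statistics entry whose referenced index no longer exists. The string ids and the findOrphans helper are hypothetical and do not reflect how Derby stores or looks up statistics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OrphanedStatsSketch {
    /**
     * Returns the statistics reference ids that match no live index id.
     * Both id collections are hypothetical stand-ins for catalog lookups.
     */
    public static List<String> findOrphans(List<String> statsRefIds, Set<String> liveIndexIds) {
        List<String> orphans = new ArrayList<>();
        for (String refId : statsRefIds) {
            if (!liveIndexIds.contains(refId)) {
                orphans.add(refId);
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        List<String> statsRefIds = Arrays.asList("idx-a", "idx-b", "idx-gone");
        Set<String> liveIndexIds = new HashSet<>(Arrays.asList("idx-a", "idx-b"));
        // Only entries without a matching live index are candidates for deletion.
        System.out.println(findOrphans(statsRefIds, liveIndexIds));
    }
}
```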

(ps: I'll be offline for some vacation, so don't expect any quick answers from me)
                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.


[jira] [Commented] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274664#comment-13274664 ] 

Kristian Waagan commented on DERBY-5684:
----------------------------------------

Thanks for the comments, Dag.

I've stumbled across some challenges when it comes to testing, and I've also identified some other problems with the patch you reviewed.
See my comments below.

DW> I think I'd reverse the logic here for clarity to: ...
I'll consider this in the next rev of the patch.

DW> Essentially a set: qualifiedCds, implement as such?
Yes, that'll probably make the code more readable. In store even lists are pretty much avoided, so I think I've been influenced by that pattern! However, the code in the IndexStatisticsAnalyzer isn't performance critical, so it's better to favor readability. The code has changed by now, but I'll see if the suggestion is still applicable.

DW> What scenario corresponds to this latter case below (irg == null):?
I don't remember... I'll remove irg == null and see if I added it because a test fell over (I think so, but maybe I had a bug somewhere else that caused this behavior).

DW> What are the pros and cons here? I would assume both should invalidate?
Yes, both should invalidate; the question is when. The daemon can commit work as it goes, and I'm not sure how this interacts with alter table. I'll check this more.

I'm rerunning tests with a newer version of the patch. I'm getting a bit worried the changes are getting too big to push this close to the release, but we can take that discussion when the patch is ready :) I believe there are things we can do to reduce the risks.
                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration-1.diff, derby-istat-exploration-2.diff, derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.


[jira] [Updated] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5684:
-----------------------------------

    Attachment: derby-istat-exploration-3.diff

Attaching the latest patch (3).
                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration-1.diff, derby-istat-exploration-2.diff, derby-istat-exploration-3.diff, derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.


[jira] [Updated] (DERBY-5684) Explore possible istat daemon improvements and optimizations

Posted by "Kristian Waagan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/DERBY-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-5684:
-----------------------------------

    Attachment: derby-istat-exploration-1.diff

Attaching a new patch with some code that didn't get included in the previous diff (git diff finger trouble).
                
> Explore possible istat daemon improvements and optimizations
> ------------------------------------------------------------
>
>                 Key: DERBY-5684
>                 URL: https://issues.apache.org/jira/browse/DERBY-5684
>             Project: Derby
>          Issue Type: Task
>            Reporter: Kristian Waagan
>         Attachments: derby-istat-exploration-1.diff, derby-istat-exploration.diff
>
>
> A task tracking some experiments on the istat daemon and the statistics update code.
