Posted to issues@trafodion.apache.org by "David Wayne Birdsall (JIRA)" <ji...@apache.org> on 2018/10/18 22:14:00 UTC

[jira] [Resolved] (TRAFODION-3223) Row count estimation code works poorly on time-ordered aged-out data

     [ https://issues.apache.org/jira/browse/TRAFODION-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Wayne Birdsall resolved TRAFODION-3223.
---------------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.4

> Row count estimation code works poorly on time-ordered aged-out data
> --------------------------------------------------------------------
>
>                 Key: TRAFODION-3223
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3223
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: any
>            Reporter: David Wayne Birdsall
>            Assignee: David Wayne Birdsall
>            Priority: Major
>             Fix For: 2.4
>
>
> The estimateRowCountBody method in HBaseClient.java estimates the number of rows in a Trafodion table by sampling cells from the first 500 rows of the first HFile it sees. If the table has a time-ordered key and data are aged out over time, large clumps of "delete" tombstones can accumulate in one or more HFiles. If estimateRowCountBody happens to sample such an HFile, it incorrectly concludes that most cells in the table are "delete" tombstones and therefore drastically underestimates the row count.
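
The bias described above can be illustrated with a small, self-contained sketch. This is not the code in HBaseClient.java: the class, the method names, and the extrapolation formula (total cells times the sampled fraction of non-delete cells, divided by cells per row) are assumptions made for illustration only. The HFile is modeled as a list of booleans, one per cell, and each row is assumed to hold a single cell.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Trafodion implementation.
class RowCountEstimateSketch {

    // Models an HFile whose oldest keys come first: a clump of delete
    // tombstones (false) followed by live "put" cells (true).
    static List<Boolean> buildTimeOrderedFile(int tombstones, int puts) {
        List<Boolean> cells = new ArrayList<>(tombstones + puts);
        for (int i = 0; i < tombstones; i++) cells.add(Boolean.FALSE);
        for (int i = 0; i < puts; i++) cells.add(Boolean.TRUE);
        return cells;
    }

    // Assumed estimator: look only at the first sampleCells cells, measure
    // the fraction that are puts, and extrapolate to the whole file.
    static long estimateRows(List<Boolean> cells, int sampleCells, int cellsPerRow) {
        int n = Math.min(sampleCells, cells.size());
        if (n == 0 || cellsPerRow <= 0) return 0;
        int puts = 0;
        for (int i = 0; i < n; i++) {
            if (cells.get(i)) puts++;
        }
        double putFraction = (double) puts / n;
        return Math.round(cells.size() * putFraction / cellsPerRow);
    }

    public static void main(String[] args) {
        // 1,000 aged-out rows (tombstones) followed by 9,000 live rows.
        List<Boolean> file = buildTimeOrderedFile(1000, 9000);
        System.out.println("prefix sample: " + estimateRows(file, 500, 1));
        System.out.println("full scan:     " + estimateRows(file, file.size(), 1));
    }
}
```

In this model the 500-cell prefix sample sees only tombstones, so the extrapolated estimate collapses to zero even though 9,000 live rows follow. Any sampling scheme that reads only the head of a time-ordered file has the same weakness.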



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)