Posted to issues@trafodion.apache.org by "David Wayne Birdsall (JIRA)" <ji...@apache.org> on 2015/12/23 17:58:46 UTC

[jira] [Created] (TRAFODION-1723) Row count estimation not accurate with multiple column families

David Wayne Birdsall created TRAFODION-1723:
-----------------------------------------------

             Summary: Row count estimation not accurate with multiple column families
                 Key: TRAFODION-1723
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1723
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-exe
    Affects Versions: 2.0-incubating, 1.3-incubating
         Environment: Everywhere
            Reporter: David Wayne Birdsall
            Assignee: David Wayne Birdsall


TRAFODION-1618 fixed a problem in the row count estimation logic in HBaseClient.java, where the old code did not scale beyond 255 columns. Review comments on that fix noted that neither the old code nor the new code takes multiple column families into account (support for multiple column families was added in TRAFODION-1419).

Testing has since confirmed that the row count estimate is inaccurate for tables with multiple column families. The worst case appears to be an estimate that is off by a factor of k, where k is the number of column families in the table.

The fix will be in the same code in HBaseClient.java that was changed for TRAFODION-1618. Instead of comparing only qualifiers, we also need to compare column family names. Alternatively, we could compare row keys.
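
To illustrate why comparing only qualifiers can overcount, here is a minimal, self-contained sketch. It is not the actual HBaseClient.java code: the class RowCountSketch, the Cell stand-in, and all method names are hypothetical, and it assumes a simplified model in which cells arrive sorted by row key, then column family, then qualifier, and in which the same qualifier names occur in every column family.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model for illustration only;
// this is NOT the estimation code in HBaseClient.java.
class RowCountSketch {

    // Minimal stand-in for an HBase key-value: row key, column family, qualifier.
    static final class Cell {
        final String row;
        final String family;
        final String qualifier;

        Cell(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    // Qualifier-only comparison: every cell whose qualifier matches the first
    // qualifier seen is treated as the start of a new row.
    static int countByQualifierOnly(List<Cell> cells) {
        if (cells.isEmpty()) return 0;
        String firstQualifier = cells.get(0).qualifier;
        int rows = 0;
        for (Cell c : cells) {
            if (c.qualifier.equals(firstQualifier)) rows++;
        }
        return rows;
    }

    // Family-plus-qualifier comparison: a new row starts only when both the
    // column family and the qualifier match the first cell seen.
    static int countByFamilyAndQualifier(List<Cell> cells) {
        if (cells.isEmpty()) return 0;
        Cell first = cells.get(0);
        int rows = 0;
        for (Cell c : cells) {
            if (c.family.equals(first.family) && c.qualifier.equals(first.qualifier)) rows++;
        }
        return rows;
    }

    // Row-key comparison: a new row starts whenever the row key changes.
    static int countByRowKey(List<Cell> cells) {
        int rows = 0;
        String previousRow = null;
        for (Cell c : cells) {
            if (!c.row.equals(previousRow)) {
                rows++;
                previousRow = c.row;
            }
        }
        return rows;
    }

    // Made-up sample: 3 rows, 2 column families, qualifiers "a" and "b" in each family.
    static List<Cell> sampleCells() {
        List<Cell> cells = new ArrayList<>();
        for (String row : new String[] {"r1", "r2", "r3"}) {
            for (String family : new String[] {"cf1", "cf2"}) {
                for (String qualifier : new String[] {"a", "b"}) {
                    cells.add(new Cell(row, family, qualifier));
                }
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        List<Cell> cells = sampleCells();
        System.out.println("qualifier only:     " + countByQualifierOnly(cells));
        System.out.println("family + qualifier: " + countByFamilyAndQualifier(cells));
        System.out.println("row key:            " + countByRowKey(cells));
    }
}
```

With this made-up sample (3 rows, k = 2 column families), the qualifier-only count is 6, twice the true row count, while the other two approaches both give 3. In this simplified model the row-key comparison has the added property that it does not depend on every row containing the first column seen.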



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)