Posted to issues@trafodion.apache.org by "David Wayne Birdsall (JIRA)" <ji...@apache.org> on 2015/12/23 17:58:46 UTC
[jira] [Created] (TRAFODION-1723) Row count estimation not accurate with multiple column families
David Wayne Birdsall created TRAFODION-1723:
-----------------------------------------------
Summary: Row count estimation not accurate with multiple column families
Key: TRAFODION-1723
URL: https://issues.apache.org/jira/browse/TRAFODION-1723
Project: Apache Trafodion
Issue Type: Bug
Components: sql-exe
Affects Versions: 2.0-incubating, 1.3-incubating
Environment: Everywhere
Reporter: David Wayne Birdsall
Assignee: David Wayne Birdsall
JIRA TRAFODION-1618 fixed a problem in the row count estimation logic in HBaseClient.java, where the old code did not scale beyond 255 columns. The review comments for that fix noted that neither the old code nor the new code takes multiple column families into account (support for which was added in JIRA TRAFODION-1419).
Testing has since confirmed that row count estimation is indeed inaccurate for tables with multiple column families. In the worst case, the estimate appears to be off by a factor of k, where k is the number of column families in the table.
The fix will be in the same code in HBaseClient.java as the fix for JIRA TRAFODION-1618. Instead of comparing only qualifiers, we also need to compare column family names. Alternatively, we could compare row keys.
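
To illustrate the idea, here is a minimal, self-contained sketch. It is not the actual HBaseClient.java code: the Cell class, the method names, and the sample data are all hypothetical, and the "qualifier-only" counter is only a simplified model of how a comparison that ignores column families could overcount. It assumes cells arrive sorted by row key, then column family, then qualifier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not taken from HBaseClient.java.
class RowCountSketch {
    // Simplified stand-in for a stored key-value entry.
    static final class Cell {
        final String row, family, qualifier;
        Cell(String row, String family, String qualifier) {
            this.row = row; this.family = family; this.qualifier = qualifier;
        }
    }

    // Counts a new row each time the first cell's qualifier reappears.
    // Ignores column families, so a qualifier reused in k families is counted k times.
    static int countByQualifier(List<Cell> cells) {
        if (cells.isEmpty()) return 0;
        String first = cells.get(0).qualifier;
        int rows = 0;
        for (Cell c : cells) {
            if (c.qualifier.equals(first)) rows++;
        }
        return rows;
    }

    // Same idea, but a cell only marks a new row if family AND qualifier both match.
    // Assumes every row contains the first cell's column.
    static int countByFamilyAndQualifier(List<Cell> cells) {
        if (cells.isEmpty()) return 0;
        Cell first = cells.get(0);
        int rows = 0;
        for (Cell c : cells) {
            if (c.family.equals(first.family) && c.qualifier.equals(first.qualifier)) rows++;
        }
        return rows;
    }

    // Alternative: count changes in the row key of consecutive (sorted) cells.
    static int countByRowKey(List<Cell> cells) {
        int rows = 0;
        String previous = null;
        for (Cell c : cells) {
            if (!c.row.equals(previous)) {
                rows++;
                previous = c.row;
            }
        }
        return rows;
    }

    // Two rows, two column families (cf1, cf2), qualifiers "a" and "b" in each family.
    static List<Cell> sampleCells() {
        List<Cell> cells = new ArrayList<>();
        for (String row : new String[] {"r1", "r2"}) {
            for (String family : new String[] {"cf1", "cf2"}) {
                for (String qualifier : new String[] {"a", "b"}) {
                    cells.add(new Cell(row, family, qualifier));
                }
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        List<Cell> cells = sampleCells();
        System.out.println(countByQualifier(cells));          // prints 4 (off by k = 2)
        System.out.println(countByFamilyAndQualifier(cells)); // prints 2
        System.out.println(countByRowKey(cells));             // prints 2
    }
}
```

In this toy data set the qualifier-only count is inflated by exactly the number of column families, which matches the worst case described above. Comparing row keys has the added advantage of not depending on any particular column being present in every row.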
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)