Posted to dev@phoenix.apache.org by "William Yang (JIRA)" <ji...@apache.org> on 2016/01/08 04:14:39 UTC

[jira] [Commented] (PHOENIX-2520) Create DDL property for metadata update frequency

    [ https://issues.apache.org/jira/browse/PHOENIX-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088634#comment-15088634 ] 

William Yang commented on PHOENIX-2520:
---------------------------------------

Updating the metadata periodically has a problem: if another connection changes the table metadata in the middle of a period, the current connection has to wait for the rest of the period before it can refresh. Even if it knows that a newer version of the metadata exists, it cannot pick it up until its cached copy is 'old' enough. So I propose the following solution:

Update the metadata once (the first time the TableRef object is created) and then update on demand.

1. Table metadata changes can be categorized into three types: 'ALWAYS' (current behaviour), 'NEVER', and 'RARELY'. For the last two types, we should update the table metadata only the first time we access the table and then update on demand.
2. If a user adds a new column and then executes a SQL statement that references the new column over an old connection, SQL compilation will fail with MetaDataEntityNotFoundException, so we know it is time to update the table metadata explicitly and retry compilation.
3. The defect of this solution is that it cannot handle column deletion. If some connection removes a column, old connections can still access the deleted column until they are re-opened. Users who cannot accept this behaviour should choose the 'ALWAYS' type, so a switch should be introduced.
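
The three points above can be sketched roughly as follows. This is only an illustration and not the code in the patch: MetaCacheSketch, EntityNotFound, refresh and compile are hypothetical names, the "server" is a plain in-memory map, and EntityNotFound stands in for MetaDataEntityNotFoundException.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins, not Phoenix classes: a "server" holding the
// authoritative column set per table, and a client-side cache of it.
class MetaCacheSketch {
    /** Stand-in for the failure raised when a statement names an unknown column. */
    static class EntityNotFound extends RuntimeException {
        EntityNotFound(String msg) { super(msg); }
    }

    final Map<String, Set<String>> server = new HashMap<String, Set<String>>(); // authoritative
    final Map<String, Set<String>> cache = new HashMap<String, Set<String>>();  // client copy
    int serverLookups = 0;                                                       // simulated RPCs

    /** Copies the server's current column set into the client cache (one simulated RPC). */
    Set<String> refresh(String table) {
        serverLookups++;
        Set<String> current = server.get(table);
        Set<String> cols = (current == null)
                ? new HashSet<String>()
                : new HashSet<String>(current);
        cache.put(table, cols);
        return cols;
    }

    /** Resolves a column against cached metadata only; loads it on first access. */
    private void compileOnce(String table, String column) {
        Set<String> cols = cache.get(table);
        if (cols == null) {
            cols = refresh(table);      // first access: load metadata once
        }
        if (!cols.contains(column)) {
            throw new EntityNotFound(table + "." + column);
        }
    }

    /** Compiles against the cache; on a miss, refreshes and retries exactly once. */
    void compile(String table, String column) {
        try {
            compileOnce(table, column);
        } catch (EntityNotFound e) {
            refresh(table);             // on demand: the cached metadata may be stale
            compileOnce(table, column); // a second failure propagates to the caller
        }
    }
}
```

In this sketch an added column costs one extra lookup on the first statement that uses it, while a column deleted on the server keeps compiling from the stale cache, which is the defect described in point 3.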

New configurations in hbase-site.xml:
<property>
   <name>phoenix.functions.preferMetaCache.enabled</name>
   <value>true</value>
</property>
<property>
   <name>phoenix.functions.preferMetaCache.enabled.your_table_name</name>
   <value>false</value>
</property>

I introduce two levels of configuration here: a global config and a table-level config. If you enable the 'preferMetaCache' property, table metadata will be updated once and then updated on demand. Otherwise, table metadata is updated every time (current behaviour). Users are responsible for deciding which mode to use.
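
A minimal sketch of how the two levels might be resolved, assuming the table-level key overrides the global one and that an unset property means the current behaviour (false). Both are assumptions made for illustration. PreferMetaCacheConfig and preferMetaCache are hypothetical names, and java.util.Properties is used here only as a simple stand-in for the real client configuration object.

```java
import java.util.Properties;

// Hypothetical helper, not part of Phoenix.
class PreferMetaCacheConfig {
    static final String GLOBAL_KEY = "phoenix.functions.preferMetaCache.enabled";

    /**
     * Returns the effective setting for a table.
     * Assumption: a table-level value wins when present, otherwise the
     * global value applies, and the default is false (update every time).
     */
    static boolean preferMetaCache(Properties conf, String tableName) {
        String tableValue = conf.getProperty(GLOBAL_KEY + "." + tableName);
        if (tableValue != null) {
            return Boolean.parseBoolean(tableValue);
        }
        return Boolean.parseBoolean(conf.getProperty(GLOBAL_KEY, "false"));
    }
}
```

With the two properties shown above, this lookup would return false for your_table_name and true for every other table.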

Since 'NEVER', 'RARELY', and 'add column' are the most common cases, this solution should work well for most scenarios. For the remaining minority of cases you can choose 'ALWAYS'.
For more details, see the patch 'preferMetaCache.patch'.

> Create DDL property for metadata update frequency
> -------------------------------------------------
>
>                 Key: PHOENIX-2520
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2520
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>         Attachments: preferMetaCache.patch
>
>
> On the client-side, Phoenix pings the server when a query is compiled to confirm that the client has the most up-to-date metadata for the table being queried. For some tables that are known to not change, this RPC is wasteful. 
> We can allow a property such as {{UPDATE_METADATA_CACHE_FREQUENCY_MS}} that specifies the time to wait before checking with the server to see if the metadata has changed. This could be specified in the CREATE TABLE call and stored in the SYSTEM.CATALOG table header row. By default the value could be 0, which would keep the current behavior. Tables that never change could use Long.MAX_VALUE. Potentially we could allow 'ALWAYS' and 'NEVER' values for convenience.
> Proposed implementation:
> - add {{public long getAge()}} method to {{PTableRef}}.
> - when setting lastAccessTime, also store System.currentTimeMillis() in a new {{setAccessTime}} private member variable
> - the getAge() would return {{System.currentTimeMillis() - setAccessTime}}
> - code in MetaDataClient would prevent call to server if age < {{UPDATE_METADATA_CACHE_FREQUENCY_MS}}
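
For reference, the age check proposed in the issue description could look roughly like the sketch below. TableRefAgeSketch and needsServerCheck are hypothetical names, not the actual PTableRef or MetaDataClient code, and the clock value is passed in as a parameter so the behaviour is easy to check.

```java
// Hypothetical sketch of the proposed age check, not Phoenix code.
class TableRefAgeSketch {
    private final long setAccessTime; // wall-clock millis when the cache entry was stored

    TableRefAgeSketch(long nowMillis) {
        this.setAccessTime = nowMillis;
    }

    /** Age of the cached entry in milliseconds, relative to the given clock value. */
    long getAge(long nowMillis) {
        return nowMillis - setAccessTime;
    }

    /** The server call is skipped only while age < updateFrequencyMs. */
    boolean needsServerCheck(long nowMillis, long updateFrequencyMs) {
        return getAge(nowMillis) >= updateFrequencyMs;
    }
}
```

A frequency of 0 always triggers the server check (the current behaviour), while Long.MAX_VALUE effectively never does. In real use the clock value would come from System.currentTimeMillis().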



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)