Posted to dev@phoenix.apache.org by Stas Sukhanov <st...@gmail.com> on 2017/09/07 11:17:03 UTC

SchemaUtil.getTableNameFromFullName() does not respect the IS_NAMESPACE_MAPPING_ENABLED flag

Hi,

We ran into a problem in HDP 4.7.0.2.6.1.0-129 (which includes the schema mapping feature from 4.8), and I can see the same problem in master. The problem itself is quite easy to understand, but what it can break is a big question.

The class org.apache.phoenix.util.SchemaUtil has the methods getSchemaNameFromFullName <https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/SchemaUtil.java#L641> and getTableNameFromFullName <https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/SchemaUtil.java#L691>, which do not respect the IS_NAMESPACE_MAPPING_ENABLED flag. Moreover, the methods treat the namespace as a schema even though the flag is supposed to be FALSE by default. I am pretty surprised that this wasn’t discovered before, so I may be wrong. I found only one related bug, PHOENIX-3460 <https://issues.apache.org/jira/browse/PHOENIX-3460>, and left a comment there.
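
To make the difference concrete, here is a minimal, self-contained sketch. It is a simplified model and not the Phoenix source: the class NameSplitSketch and both method names are hypothetical, and the only assumption carried over is that ':' is the namespace separator and '.' the schema separator.

```java
// Simplified, hypothetical model of the name splitting; NOT the Phoenix source.
final class NameSplitSketch {
    static final String NAMESPACE_SEPARATOR = ":"; // assumed namespace separator
    static final String NAME_SEPARATOR = ".";      // assumed schema separator

    // Models the behaviour described above: a ':' in the name is always
    // treated as a schema boundary, whatever the flag says.
    static String[] splitIgnoringFlag(String fullName) {
        String sep = fullName.contains(NAMESPACE_SEPARATOR)
                ? NAMESPACE_SEPARATOR
                : NAME_SEPARATOR;
        return splitOn(fullName, sep);
    }

    // One possible flag-aware variant: ':' only counts as a separator
    // when namespace mapping is enabled.
    static String[] splitRespectingFlag(String fullName, boolean namespaceMappingEnabled) {
        String sep = (namespaceMappingEnabled && fullName.contains(NAMESPACE_SEPARATOR))
                ? NAMESPACE_SEPARATOR
                : NAME_SEPARATOR;
        return splitOn(fullName, sep);
    }

    // Returns {schema, table}; schema is empty when the separator is absent.
    private static String[] splitOn(String fullName, String sep) {
        int idx = fullName.indexOf(sep);
        if (idx < 0) {
            return new String[] {"", fullName};
        }
        return new String[] {
            fullName.substring(0, idx),
            fullName.substring(idx + sep.length())
        };
    }
}
```

With the flag off, splitRespectingFlag("ns:my_table", false) keeps "ns:my_table" as a single table name with an empty schema, while splitIgnoringFlag("ns:my_table") yields schema "ns" and table "my_table".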

The problem can cause quite nasty bugs. Take a look at the method PhoenixRuntime#generateColumnInfo <https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L469> (by the way, there is another method in SchemaUtil with a different implementation) and see how it works with the table name and the cache. The method calls PhoenixRuntime#getTable <https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L442> with the normalized table name. If the cache doesn’t contain the table, it tries to update the cache, but with the table name returned by getTableNameFromFullName, and fails.

Consider an example (schema mapping feature is disabled):

generateColumnInfo("\"ns:my_table\"") -> getTable("ns:my_table") -> MetaDataClient#updateCache("ns", "my_table")

Therefore it looks up the table "ns:my_table" in the cache, updates the cache with "ns.my_table", and then throws an exception naming "ns:my_table". Whether the failure shows up depends on the cache state, which makes it really nasty. The wrong table name in the exception is also something to consider.
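
The lookup/update mismatch can be modelled with a toy cache. This is only a sketch of the sequence described above, not the real PhoenixRuntime or MetaDataClient code: the class CacheMismatchSketch, its methods, and the exception message are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the lookup/update key mismatch; NOT the Phoenix implementation.
final class CacheMismatchSketch {
    // Cache keyed by full table name; the value stands in for table metadata.
    private final Map<String, String> cache = new HashMap<>();

    // Simulates a cache that already holds the table under the given key.
    void preload(String key) {
        cache.put(key, "metadata for " + key);
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }

    // Looks up by the normalized name; on a miss, refreshes the cache under
    // the name rebuilt from the (schema, table) split on ':', then retries.
    String getTable(String normalizedName) {
        String hit = cache.get(normalizedName);
        if (hit != null) {
            return hit;
        }
        int idx = normalizedName.indexOf(':');
        String schema = idx < 0 ? "" : normalizedName.substring(0, idx);
        String table = normalizedName.substring(idx + 1);
        String updatedKey = schema.isEmpty() ? table : schema + "." + table;
        cache.put(updatedKey, "metadata for " + updatedKey);

        // Retry with the original key: it misses whenever the two keys differ.
        hit = cache.get(normalizedName);
        if (hit == null) {
            throw new IllegalStateException("Table undefined: " + normalizedName);
        }
        return hit;
    }
}
```

On an empty cache, getTable("ns:my_table") stores an entry under "ns.my_table" and then throws with "ns:my_table" in the message. If the cache was preloaded under "ns:my_table", the same call succeeds, which is the cache-state dependence mentioned above.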

Alright, questions:

1) Am I right?
2) Is there any plan to get rid of those methods?
3) Is there any plan to fix those methods, or must namespace mapping always be turned on?

If there are any suggestions on how this should be fixed, I can help with that. In our project we changed the methods to the default behavior (IS_NAMESPACE_MAPPING_ENABLED=false), but I do not see a clear way to fetch this flag from the config.

// Stas