Posted to dev@phoenix.apache.org by Marcin Januszkiewicz <ka...@gmail.com> on 2017/11/28 08:55:57 UTC

Phoenix uses String representation of columns that cannot be correctly parsed.

Hi,

I've been doing some investigating into PHOENIX-3696 (
https://issues.apache.org/jira/browse/PHOENIX-3696).
I've found that many Phoenix utility methods operate on string
representations of column names while making incorrect assumptions about
the shape of the column name.
For example, the method ColumnInfo.getDisplayName assumes that if a period
exists in a column name, then it must be a column family separator.
SchemaUtil.generateColumnInfo will separate column family from column names
with a period, but won't escape them, making it impossible to parse back to
the proper column name.
PhoenixRuntime.getColumnInfo will throw an Exception if there is more
than one period in the column name, even if properly escaped.
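
To illustrate why the unescaped form cannot be parsed back, here is a
minimal standalone sketch. It does not call any Phoenix code; the class
NaiveColumnNames and both helpers are hypothetical and only mimic the
join-with-a-period and split-on-a-period behaviour described above.

```java
import java.util.Arrays;

// Standalone illustration; hypothetical helpers, not Phoenix code.
final class NaiveColumnNames {
    // Joins family and column with a period, without escaping either part.
    static String join(String family, String column) {
        return family == null ? column : family + "." + column;
    }

    // Treats the first period as the column family separator.
    static String[] splitOnFirstPeriod(String name) {
        int idx = name.indexOf('.');
        if (idx < 0) {
            return new String[] {null, name};
        }
        return new String[] {name.substring(0, idx), name.substring(idx + 1)};
    }

    public static void main(String[] args) {
        String a = join("CF", "a.b"); // family CF, column a.b
        String b = join("CF.a", "b"); // family CF.a, column b
        // Two different columns collapse to the same string "CF.a.b".
        System.out.println(a.equals(b)); // prints "true"
        // A family-less column named a.b is misread as family a, column b.
        System.out.println(Arrays.toString(splitOnFirstPeriod(join(null, "a.b")))); // prints "[a, b]"
    }
}
```

Once the two parts are joined this way, no parser can tell which period
was the separator.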
This pattern is present within the public API of Phoenix, so I don't think
it can be changed in a backward-incompatible way.
I think the proper way of fixing this is to introduce a new column
identifier class that can be passed around instead, and to use strings
only when we need a serialized value.
Then we could deprecate the string-based methods.
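
As a rough sketch of what I mean, something along these lines. The class
name ColumnIdentifier, its methods, and the serialized format (each part
double-quoted, embedded quotes doubled) are all assumptions made for
illustration, not an existing Phoenix API or a finished design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical class; name, methods and serialized format are illustrative only.
final class ColumnIdentifier {
    private final String family; // null when the column has no family
    private final String name;

    ColumnIdentifier(String family, String name) {
        this.family = family;
        this.name = Objects.requireNonNull(name);
    }

    String getFamily() { return family; }
    String getName() { return name; }

    // Serialized form: each part is double-quoted, embedded quotes are doubled.
    String serialize() {
        String quotedName = quote(name);
        return family == null ? quotedName : quote(family) + "." + quotedName;
    }

    private static String quote(String part) {
        return "\"" + part.replace("\"", "\"\"") + "\"";
    }

    // Parses the output of serialize(); periods inside quotes are not separators.
    static ColumnIdentifier parse(String s) {
        List<String> parts = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            if (s.charAt(i) != '"') {
                throw new IllegalArgumentException("Expected opening quote at index " + i);
            }
            i++;
            StringBuilder part = new StringBuilder();
            boolean closed = false;
            while (i < s.length() && !closed) {
                char c = s.charAt(i);
                if (c != '"') {
                    part.append(c);
                    i++;
                } else if (i + 1 < s.length() && s.charAt(i + 1) == '"') {
                    part.append('"'); // a doubled quote is a literal quote
                    i += 2;
                } else {
                    closed = true; // closing quote of this part
                    i++;
                }
            }
            if (!closed) {
                throw new IllegalArgumentException("Unterminated quote in: " + s);
            }
            parts.add(part.toString());
            if (i < s.length()) {
                if (s.charAt(i) != '.' || i + 1 == s.length()) {
                    throw new IllegalArgumentException("Expected '.' between parts in: " + s);
                }
                i++;
            }
        }
        if (parts.size() == 1) {
            return new ColumnIdentifier(null, parts.get(0));
        }
        if (parts.size() == 2) {
            return new ColumnIdentifier(parts.get(0), parts.get(1));
        }
        throw new IllegalArgumentException("Expected one or two parts in: " + s);
    }
}
```

With this, new ColumnIdentifier("CF", "a.b").serialize() gives "CF"."a.b",
and parse() on that string returns family CF and column a.b again. Callers
pass the object around and only touch the string form at the boundaries.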
What do you think?