Posted to dev@hbase.apache.org by Francis Liu <to...@apache.org> on 2013/05/08 01:38:34 UTC

[DISCUSS] Namespace Delimiter

Hi,

As part of the namespace patch (HBASE-8015), we will need a delimiter to separate the namespace name from the table name. The obvious choice would be a dot ('.'). However, since a dot is presently a valid character in table names, that would require users to migrate their tables (i.e., rename them) as part of the upgrade to 0.96. Another option is to use a different delimiter and avoid the table migration altogether. Thoughts?

-Francis

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
> I think if we only have a string for fully qualified table name internally
> and rely on parsing the dot we are going to have migration issues somewhere.

What migration issues do you see?

> parameters, but internally IMHO we need separate namespace argument, or if

We can get there incrementally, as the change is pervasive. As a first cut we'll have to use a delimiter.
Also, we'll have to use delimiters for zk, hdfs and catalog.

> it is too painful, use different internal separator that will not cause
> backward compat and change it for display only.

Wouldn't it be confusing to have an internal and external delimiter?
We would have to know what's external and internal and then
would have to remember to do the translation when making calls between the two.

On May 8, 2013, at 6:24 PM, Sergey Shelukhin wrote:

> I think if we only have a string for fully qualified table name internally
> and rely on parsing the dot we are going to have migration issues somewhere.
> Externally I agree with Elliot, we can do all kinds of things like add
> parameters, but internally IMHO we need separate namespace argument, or if
> it is too painful, use different internal separator that will not cause
> backward compat and change it for display only.
> 
> On Wed, May 8, 2013 at 6:00 PM, Francis Liu <to...@apache.org> wrote:
> 
>>> I think putting existing tables with "." in table name as part of default
>>> namespace is better choice among the two.
>> 
>> Correct me if I missed something here. That is probably the more difficult
>> of the two as you can't derive membership information by just parsing the
>> table name. Which is what is passed around internally most of the time.
>> Because of that I prefer the auto namespace generation approach.
>> 
>> On May 8, 2013, at 5:11 PM, Ted Yu wrote:
>> 
>>> bq. by recognizing existing tables with "." as part of the default
>>> namespace or automatically create namespaces for tables with dots in
>> them.
>>> 
>>> I think putting existing tables with "." in table name as part of default
>>> namespace is better choice among the two.
>>> 
>>> Cheers
>>> 
>>> On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org> wrote:
>>> 
>>>> There shouldn't be any ambiguity. There's fully-qualified table names
>> and
>>>> there's table names.  Table name constant changes were to make the names
>>>> less funky.
>>>> 
>>>> I like your suggestion since it simplifies migration. Though it seems
>>>> we're kicking the can down the road here. In a way we're avoiding the
>>>> problem by specifying an internal delimiter and adding extra complexity
>> to
>>>> prevent the user from using it. Having a way of specifying a table fully
>>>> qualified seems to be something fundamental and convenient, if we don't
>>>> support one now we'll have even more trouble in the future. Looking at
>> the
>>>> suggestions we can potentially make migration painless by recognizing
>>>> existing tables with "." as part of the default namespace or
>> automatically
>>>> create namespaces for tables with dots in them. Neither requires
>> renaming
>>>> tables. They only need to rename tables if they want to start organizing
>>>> things into namespaces which they will have to do in any scenario.
>>>> 
>>>> -Francis
>>>> 
>>>> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
>>>> 
>>>>> With this solution there's no naming ambiguity.  There's no
>>>>> overloading table name to actually be two different things.  There's
>>>>> no need for users to rename their tables. Most code that is already
>>>>> written will still be source compatible.  No need to change table name
>>>>> constants or anything like that.
>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: [DISCUSS] Namespace Delimiter

Posted by Elliott Clark <ec...@apache.org>.
* Metrics are created from a HRegion so they can have the namespace
pretty easily.
* Hfile links would be relative to the namespace unless explicit, so
there's no issue there.
* I showed how znodes could be handled.
* I also had examples for the shell.

It seems like Francis is going ahead with namespace auto-generation so
I'll hold my concerns until we see what the implementation looks like.

On Thu, May 9, 2013 at 6:44 PM, Francis Liu <to...@apache.org> wrote:
>> Francis, your proposal for auto-generating the namespaces looks good, but I
>> am wondering whether the user might be surprised later when they find about
>> the all the namespaces.
>
>
> The ones that don't read the release notes will probably be surprised though I don't think it'll be a big issue.
> As it doesn't affect existing behavior, apart from not being able to create new tables with dots.
> And they only have to do extra work if they choose to start using namespaces the way it should be.
>
>> BTW, renaming a table will break snapshots, unless it is done as a clone /
>> restore.
>
> I see, clone/restore was one approach I had in mind to implement renaming.
>
> On May 9, 2013, at 6:00 PM, Enis Söztutar wrote:
>
>> Elliot's suggestion sounds good, but unfortunately there are places where
>> we use the table name as a string. We cannot expect all those places to
>> also have namespaces. Some of the examples are metrics names per table,
>> hfile links encoding the table name in it, and maybe znodes, and of course
>> the shell. I would agree though that we should handle the schama and table
>> names as separate entities within inside hbase.
>>
>> Francis, your proposal for auto-generating the namespaces looks good, but I
>> am wondering whether the user might be surprised later when they find about
>> the all the namespaces.
>>
>> BTW, renaming a table will break snapshots, unless it is done as a clone /
>> restore.
>>
>>
>>
>>
>>
>>
>> On Thu, May 9, 2013 at 4:43 PM, Francis Liu <to...@apache.org> wrote:
>>
>>> I can probably incorporate the migration into the main patch will see how
>>> big it gets. And the rename tool as a subtask.
>>>
>>> On May 9, 2013, at 4:21 PM, Ted Yu wrote:
>>>
>>>> The plan is feasible.
>>>> This would be done in sub-task of HBASE-8015, right ?
>>>>
>>>> Thanks
>>>>
>>>> On Thu, May 9, 2013 at 4:03 PM, Francis Liu <to...@apache.org> wrote:
>>>>
>>>>> Sounds like I should give the auto-generate approach a try.
>>>>>
>>>>> In essence it'll do the following:
>>>>>
>>>>> - Tables without '.' will be moved into the default namespace.
>>>>> - Tables with '.' will be move into new namespaces
>>>>>       - namespaces will be delimited by the last '.' in the table name
>>>>>               ie org.apache.hbase.MyTable, namespace = org.apache.hbase
>>>>> - In both cases the oldTableName is the same as the fullTableName
>>>>> - all existing apis and cli commands will be expecting full table names
>>>>> unless explicitly stated
>>>>> - a RenameTable tool will be provided to rename offline tables
>>>>>
>>>>> Let me know if any clarification is needed.
>>>>>
>>>>> -Francis
>>>>>
>>>>>
>>>>> On May 8, 2013, at 8:40 PM, Stack wrote:
>>>>>
>>>>>> On Wed, May 8, 2013 at 7:03 PM, Ted Yu <yu...@gmail.com> wrote:
>>>>>>
>>>>>>> w.r.t. auto-generate option, we need some heuristics.
>>>>>>>
>>>>>>> e.g. would namespace.schemaname.tablename be supported ?
>>>>>>>
>>>>>>> The auto-generate option may create namespace name out of existing
>>>>> schema
>>>>>>> name.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> There is no schema name in hbase.  James's convention up in phoenix at
>>>>> the
>>>>>> hbase-level is a table name w/ a dot in it.
>>>>>>
>>>>>> This is already a difficult enough issue.  No need to add
>>> complications.
>>>>>>
>>>>>> St.Ack
>>>>>
>>>>>
>>>
>>>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
> Francis, your proposal for auto-generating the namespaces looks good, but I
> am wondering whether the user might be surprised later when they find about
> the all the namespaces.


The ones that don't read the release notes will probably be surprised, though I don't think it'll be a big issue,
as it doesn't affect existing behavior apart from not being able to create new tables with dots.
And they only have to do extra work if they choose to start using namespaces the way they are meant to be used.

> BTW, renaming a table will break snapshots, unless it is done as a clone /
> restore.

I see, clone/restore was one approach I had in mind to implement renaming.

On May 9, 2013, at 6:00 PM, Enis Söztutar wrote:

> Elliot's suggestion sounds good, but unfortunately there are places where
> we use the table name as a string. We cannot expect all those places to
> also have namespaces. Some of the examples are metrics names per table,
> hfile links encoding the table name in it, and maybe znodes, and of course
> the shell. I would agree though that we should handle the schama and table
> names as separate entities within inside hbase.
> 
> Francis, your proposal for auto-generating the namespaces looks good, but I
> am wondering whether the user might be surprised later when they find about
> the all the namespaces.
> 
> BTW, renaming a table will break snapshots, unless it is done as a clone /
> restore.
> 
> 
> 
> 
> 
> 
> On Thu, May 9, 2013 at 4:43 PM, Francis Liu <to...@apache.org> wrote:
> 
>> I can probably incorporate the migration into the main patch will see how
>> big it gets. And the rename tool as a subtask.
>> 
>> On May 9, 2013, at 4:21 PM, Ted Yu wrote:
>> 
>>> The plan is feasible.
>>> This would be done in sub-task of HBASE-8015, right ?
>>> 
>>> Thanks
>>> 
>>> On Thu, May 9, 2013 at 4:03 PM, Francis Liu <to...@apache.org> wrote:
>>> 
>>>> Sounds like I should give the auto-generate approach a try.
>>>> 
>>>> In essence it'll do the following:
>>>> 
>>>> - Tables without '.' will be moved into the default namespace.
>>>> - Tables with '.' will be move into new namespaces
>>>>       - namespaces will be delimited by the last '.' in the table name
>>>>               ie org.apache.hbase.MyTable, namespace = org.apache.hbase
>>>> - In both cases the oldTableName is the same as the fullTableName
>>>> - all existing apis and cli commands will be expecting full table names
>>>> unless explicitly stated
>>>> - a RenameTable tool will be provided to rename offline tables
>>>> 
>>>> Let me know if any clarification is needed.
>>>> 
>>>> -Francis
>>>> 
>>>> 
>>>> On May 8, 2013, at 8:40 PM, Stack wrote:
>>>> 
>>>>> On Wed, May 8, 2013 at 7:03 PM, Ted Yu <yu...@gmail.com> wrote:
>>>>> 
>>>>>> w.r.t. auto-generate option, we need some heuristics.
>>>>>> 
>>>>>> e.g. would namespace.schemaname.tablename be supported ?
>>>>>> 
>>>>>> The auto-generate option may create namespace name out of existing
>>>> schema
>>>>>> name.
>>>>>> 
>>>>> 
>>>>> 
>>>>> There is no schema name in hbase.  James's convention up in phoenix at
>>>> the
>>>>> hbase-level is a table name w/ a dot in it.
>>>>> 
>>>>> This is already a difficult enough issue.  No need to add
>> complications.
>>>>> 
>>>>> St.Ack
>>>> 
>>>> 
>> 
>> 


Re: [DISCUSS] Namespace Delimiter

Posted by Enis Söztutar <en...@gmail.com>.
Elliott's suggestion sounds good, but unfortunately there are places where
we use the table name as a string. We cannot expect all those places to
also have namespaces. Some of the examples are per-table metrics names,
hfile links that encode the table name, maybe znodes, and of course
the shell. I would agree though that we should handle the schema and table
names as separate entities inside hbase.

Francis, your proposal for auto-generating the namespaces looks good, but I
am wondering whether the user might be surprised later when they find out
about all the namespaces.

BTW, renaming a table will break snapshots, unless it is done as a clone /
restore.






On Thu, May 9, 2013 at 4:43 PM, Francis Liu <to...@apache.org> wrote:

> I can probably incorporate the migration into the main patch will see how
> big it gets. And the rename tool as a subtask.
>
> On May 9, 2013, at 4:21 PM, Ted Yu wrote:
>
> > The plan is feasible.
> > This would be done in sub-task of HBASE-8015, right ?
> >
> > Thanks
> >
> > On Thu, May 9, 2013 at 4:03 PM, Francis Liu <to...@apache.org> wrote:
> >
> >> Sounds like I should give the auto-generate approach a try.
> >>
> >> In essence it'll do the following:
> >>
> >> - Tables without '.' will be moved into the default namespace.
> >> - Tables with '.' will be move into new namespaces
> >>        - namespaces will be delimited by the last '.' in the table name
> >>                ie org.apache.hbase.MyTable, namespace = org.apache.hbase
> >> - In both cases the oldTableName is the same as the fullTableName
> >> - all existing apis and cli commands will be expecting full table names
> >> unless explicitly stated
> >> - a RenameTable tool will be provided to rename offline tables
> >>
> >> Let me know if any clarification is needed.
> >>
> >> -Francis
> >>
> >>
> >> On May 8, 2013, at 8:40 PM, Stack wrote:
> >>
> >>> On Wed, May 8, 2013 at 7:03 PM, Ted Yu <yu...@gmail.com> wrote:
> >>>
> >>>> w.r.t. auto-generate option, we need some heuristics.
> >>>>
> >>>> e.g. would namespace.schemaname.tablename be supported ?
> >>>>
> >>>> The auto-generate option may create namespace name out of existing
> >> schema
> >>>> name.
> >>>>
> >>>
> >>>
> >>> There is no schema name in hbase.  James's convention up in phoenix at
> >> the
> >>> hbase-level is a table name w/ a dot in it.
> >>>
> >>> This is already a difficult enough issue.  No need to add
> complications.
> >>>
> >>> St.Ack
> >>
> >>
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
I can probably incorporate the migration into the main patch; we'll see how big it gets. The rename tool can be a subtask.

On May 9, 2013, at 4:21 PM, Ted Yu wrote:

> The plan is feasible.
> This would be done in sub-task of HBASE-8015, right ?
> 
> Thanks
> 
> On Thu, May 9, 2013 at 4:03 PM, Francis Liu <to...@apache.org> wrote:
> 
>> Sounds like I should give the auto-generate approach a try.
>> 
>> In essence it'll do the following:
>> 
>> - Tables without '.' will be moved into the default namespace.
>> - Tables with '.' will be move into new namespaces
>>        - namespaces will be delimited by the last '.' in the table name
>>                ie org.apache.hbase.MyTable, namespace = org.apache.hbase
>> - In both cases the oldTableName is the same as the fullTableName
>> - all existing apis and cli commands will be expecting full table names
>> unless explicitly stated
>> - a RenameTable tool will be provided to rename offline tables
>> 
>> Let me know if any clarification is needed.
>> 
>> -Francis
>> 
>> 
>> On May 8, 2013, at 8:40 PM, Stack wrote:
>> 
>>> On Wed, May 8, 2013 at 7:03 PM, Ted Yu <yu...@gmail.com> wrote:
>>> 
>>>> w.r.t. auto-generate option, we need some heuristics.
>>>> 
>>>> e.g. would namespace.schemaname.tablename be supported ?
>>>> 
>>>> The auto-generate option may create namespace name out of existing
>> schema
>>>> name.
>>>> 
>>> 
>>> 
>>> There is no schema name in hbase.  James's convention up in phoenix at
>> the
>>> hbase-level is a table name w/ a dot in it.
>>> 
>>> This is already a difficult enough issue.  No need to add complications.
>>> 
>>> St.Ack
>> 
>> 


Re: [DISCUSS] Namespace Delimiter

Posted by Ted Yu <yu...@gmail.com>.
The plan is feasible.
This would be done in a sub-task of HBASE-8015, right?

Thanks

On Thu, May 9, 2013 at 4:03 PM, Francis Liu <to...@apache.org> wrote:

> Sounds like I should give the auto-generate approach a try.
>
> In essence it'll do the following:
>
> - Tables without '.' will be moved into the default namespace.
> - Tables with '.' will be move into new namespaces
>         - namespaces will be delimited by the last '.' in the table name
>                 ie org.apache.hbase.MyTable, namespace = org.apache.hbase
> - In both cases the oldTableName is the same as the fullTableName
> - all existing apis and cli commands will be expecting full table names
> unless explicitly stated
> - a RenameTable tool will be provided to rename offline tables
>
> Let me know if any clarification is needed.
>
> -Francis
>
>
> On May 8, 2013, at 8:40 PM, Stack wrote:
>
> > On Wed, May 8, 2013 at 7:03 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> >> w.r.t. auto-generate option, we need some heuristics.
> >>
> >> e.g. would namespace.schemaname.tablename be supported ?
> >>
> >> The auto-generate option may create namespace name out of existing
> schema
> >> name.
> >>
> >
> >
> > There is no schema name in hbase.  James's convention up in phoenix at
> the
> > hbase-level is a table name w/ a dot in it.
> >
> > This is already a difficult enough issue.  No need to add complications.
> >
> > St.Ack
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
Sounds like I should give the auto-generate approach a try. 

In essence it'll do the following:

- Tables without '.' will be moved into the default namespace.
- Tables with '.' will be moved into new namespaces
	- namespaces will be delimited by the last '.' in the table name
		i.e. org.apache.hbase.MyTable, namespace = org.apache.hbase
- In both cases the oldTableName is the same as the fullTableName
- all existing apis and cli commands will be expecting full table names unless explicitly stated
- a RenameTable tool will be provided to rename offline tables
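
The last-dot rule in the list above could be sketched roughly as follows. This is only an illustration under assumptions: the class name LegacyTableNames, both method names, and the "default" namespace string are hypothetical and are not taken from the HBASE-8015 patch.

```java
// Hypothetical helper, not part of HBase: illustrates the last-dot split only.
final class LegacyTableNames {
    // Assumed name for the default namespace; the thread does not specify one.
    static final String DEFAULT_NAMESPACE = "default";

    private LegacyTableNames() {}

    /** Returns the namespace implied by an existing (pre-namespace) table name. */
    static String namespaceOf(String oldTableName) {
        int lastDot = oldTableName.lastIndexOf('.');
        // No dot at all: the table belongs to the default namespace.
        return lastDot < 0 ? DEFAULT_NAMESPACE : oldTableName.substring(0, lastDot);
    }

    /** Returns the part after the last dot, or the whole name if there is no dot. */
    static String qualifierOf(String oldTableName) {
        // lastIndexOf returns -1 when no dot exists, so substring(0) keeps the whole name.
        return oldTableName.substring(oldTableName.lastIndexOf('.') + 1);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"MyTable", "org.apache.hbase.MyTable"}) {
            System.out.println(name + " -> namespace=" + namespaceOf(name)
                + ", table=" + qualifierOf(name));
        }
    }
}
```

Under these assumptions org.apache.hbase.MyTable lands in namespace org.apache.hbase, and MyTable lands in the default namespace. Names that begin or end with a dot are not treated specially in this sketch and would need their own handling.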

Let me know if any clarification is needed.

-Francis
	

On May 8, 2013, at 8:40 PM, Stack wrote:

> On Wed, May 8, 2013 at 7:03 PM, Ted Yu <yu...@gmail.com> wrote:
> 
>> w.r.t. auto-generate option, we need some heuristics.
>> 
>> e.g. would namespace.schemaname.tablename be supported ?
>> 
>> The auto-generate option may create namespace name out of existing schema
>> name.
>> 
> 
> 
> There is no schema name in hbase.  James's convention up in phoenix at the
> hbase-level is a table name w/ a dot in it.
> 
> This is already a difficult enough issue.  No need to add complications.
> 
> St.Ack


Re: [DISCUSS] Namespace Delimiter

Posted by Stack <st...@duboce.net>.
On Wed, May 8, 2013 at 7:03 PM, Ted Yu <yu...@gmail.com> wrote:

> w.r.t. auto-generate option, we need some heuristics.
>
> e.g. would namespace.schemaname.tablename be supported ?
>
> The auto-generate option may create namespace name out of existing schema
> name.
>


There is no schema name in hbase.  James's convention up in phoenix at the
hbase-level is a table name w/ a dot in it.

This is already a difficult enough issue.  No need to add complications.

St.Ack

Re: [DISCUSS] Namespace Delimiter

Posted by Ted Yu <yu...@gmail.com>.
w.r.t. auto-generate option, we need some heuristics.

e.g. would namespace.schemaname.tablename be supported ?

The auto-generate option may create namespace name out of existing schema
name.

Cheers

On Wed, May 8, 2013 at 6:58 PM, Francis Liu <to...@apache.org> wrote:

> > Looks like using '#' as delimiter becomes an attractive option.
>
> Doesn't the auto-generate option sound reasonable?
>
> On May 8, 2013, at 6:27 PM, Ted Yu wrote:
>
> > bq. use different internal separator that will not cause backward compat
> > and change it for display only.
> >
> > User would copy / paste table name from UI and expect it to be accepted
> by
> > shell, etc.
> >
> > Looks like using '#' as delimiter becomes an attractive option.
> >
> > On Wed, May 8, 2013 at 6:24 PM, Sergey Shelukhin <sergey@hortonworks.com
> >wrote:
> >
> >> I think if we only have a string for fully qualified table name
> internally
> >> and rely on parsing the dot we are going to have migration issues
> >> somewhere.
> >> Externally I agree with Elliot, we can do all kinds of things like add
> >> parameters, but internally IMHO we need separate namespace argument, or
> if
> >> it is too painful, use different internal separator that will not cause
> >> backward compat and change it for display only.
> >>
> >> On Wed, May 8, 2013 at 6:00 PM, Francis Liu <to...@apache.org> wrote:
> >>
> >>>> I think putting existing tables with "." in table name as part of
> >> default
> >>>> namespace is better choice among the two.
> >>>
> >>> Correct me if I missed something here. That is probably the more
> >> difficult
> >>> of the two as you can't derive membership information by just parsing
> the
> >>> table name. Which is what is passed around internally most of the time.
> >>> Because of that I prefer the auto namespace generation approach.
> >>>
> >>> On May 8, 2013, at 5:11 PM, Ted Yu wrote:
> >>>
> >>>> bq. by recognizing existing tables with "." as part of the default
> >>>> namespace or automatically create namespaces for tables with dots in
> >>> them.
> >>>>
> >>>> I think putting existing tables with "." in table name as part of
> >> default
> >>>> namespace is better choice among the two.
> >>>>
> >>>> Cheers
> >>>>
> >>>> On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org>
> wrote:
> >>>>
> >>>>> There shouldn't be any ambiguity. There's fully-qualified table names
> >>> and
> >>>>> there's table names.  Table name constant changes were to make the
> >> names
> >>>>> less funky.
> >>>>>
> >>>>> I like your suggestion since it simplifies migration. Though it seems
> >>>>> we're kicking the can down the road here. In a way we're avoiding the
> >>>>> problem by specifying an internal delimiter and adding extra
> >> complexity
> >>> to
> >>>>> prevent the user from using it. Having a way of specifying a table
> >> fully
> >>>>> qualified seems to be something fundamental and convenient, if we
> >> don't
> >>>>> support one now we'll have even more trouble in the future. Looking
> at
> >>> the
> >>>>> suggestions we can potentially make migration painless by recognizing
> >>>>> existing tables with "." as part of the default namespace or
> >>> automatically
> >>>>> create namespaces for tables with dots in them. Neither requires
> >>> renaming
> >>>>> tables. They only need to rename tables if they want to start
> >> organizing
> >>>>> things into namespaces which they will have to do in any scenario.
> >>>>>
> >>>>> -Francis
> >>>>>
> >>>>> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
> >>>>>
> >>>>>> With this solution there's no naming ambiguity.  There's no
> >>>>>> overloading table name to actually be two different things.  There's
> >>>>>> no need for users to rename their tables. Most code that is already
> >>>>>> written will still be source compatible.  No need to change table
> >> name
> >>>>>> constants or anything like that.
> >>>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
> Looks like using '#' as delimiter becomes an attractive option.

Doesn't the auto-generate option sound reasonable?

On May 8, 2013, at 6:27 PM, Ted Yu wrote:

> bq. use different internal separator that will not cause backward compat
> and change it for display only.
> 
> User would copy / paste table name from UI and expect it to be accepted by
> shell, etc.
> 
> Looks like using '#' as delimiter becomes an attractive option.
> 
> On Wed, May 8, 2013 at 6:24 PM, Sergey Shelukhin <se...@hortonworks.com>wrote:
> 
>> I think if we only have a string for fully qualified table name internally
>> and rely on parsing the dot we are going to have migration issues
>> somewhere.
>> Externally I agree with Elliot, we can do all kinds of things like add
>> parameters, but internally IMHO we need separate namespace argument, or if
>> it is too painful, use different internal separator that will not cause
>> backward compat and change it for display only.
>> 
>> On Wed, May 8, 2013 at 6:00 PM, Francis Liu <to...@apache.org> wrote:
>> 
>>>> I think putting existing tables with "." in table name as part of
>> default
>>>> namespace is better choice among the two.
>>> 
>>> Correct me if I missed something here. That is probably the more
>> difficult
>>> of the two as you can't derive membership information by just parsing the
>>> table name. Which is what is passed around internally most of the time.
>>> Because of that I prefer the auto namespace generation approach.
>>> 
>>> On May 8, 2013, at 5:11 PM, Ted Yu wrote:
>>> 
>>>> bq. by recognizing existing tables with "." as part of the default
>>>> namespace or automatically create namespaces for tables with dots in
>>> them.
>>>> 
>>>> I think putting existing tables with "." in table name as part of
>> default
>>>> namespace is better choice among the two.
>>>> 
>>>> Cheers
>>>> 
>>>> On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org> wrote:
>>>> 
>>>>> There shouldn't be any ambiguity. There's fully-qualified table names
>>> and
>>>>> there's table names.  Table name constant changes were to make the
>> names
>>>>> less funky.
>>>>> 
>>>>> I like your suggestion since it simplifies migration. Though it seems
>>>>> we're kicking the can down the road here. In a way we're avoiding the
>>>>> problem by specifying an internal delimiter and adding extra
>> complexity
>>> to
>>>>> prevent the user from using it. Having a way of specifying a table
>> fully
>>>>> qualified seems to be something fundamental and convenient, if we
>> don't
>>>>> support one now we'll have even more trouble in the future. Looking at
>>> the
>>>>> suggestions we can potentially make migration painless by recognizing
>>>>> existing tables with "." as part of the default namespace or
>>> automatically
>>>>> create namespaces for tables with dots in them. Neither requires
>>> renaming
>>>>> tables. They only need to rename tables if they want to start
>> organizing
>>>>> things into namespaces which they will have to do in any scenario.
>>>>> 
>>>>> -Francis
>>>>> 
>>>>> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
>>>>> 
>>>>>> With this solution there's no naming ambiguity.  There's no
>>>>>> overloading table name to actually be two different things.  There's
>>>>>> no need for users to rename their tables. Most code that is already
>>>>>> written will still be source compatible.  No need to change table
>> name
>>>>>> constants or anything like that.
>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>> 


Re: [DISCUSS] Namespace Delimiter

Posted by Ted Yu <yu...@gmail.com>.
bq. use different internal separator that will not cause backward compat
and change it for display only.

User would copy / paste table name from UI and expect it to be accepted by
shell, etc.

Looks like using '#' as delimiter becomes an attractive option.

On Wed, May 8, 2013 at 6:24 PM, Sergey Shelukhin <se...@hortonworks.com> wrote:

> I think if we only have a string for fully qualified table name internally
> and rely on parsing the dot we are going to have migration issues
> somewhere.
> Externally I agree with Elliot, we can do all kinds of things like add
> parameters, but internally IMHO we need separate namespace argument, or if
> it is too painful, use different internal separator that will not cause
> backward compat and change it for display only.
>
> On Wed, May 8, 2013 at 6:00 PM, Francis Liu <to...@apache.org> wrote:
>
> > > I think putting existing tables with "." in table name as part of
> default
> > > namespace is better choice among the two.
> >
> > Correct me if I missed something here. That is probably the more
> difficult
> > of the two as you can't derive membership information by just parsing the
> > table name. Which is what is passed around internally most of the time.
> > Because of that I prefer the auto namespace generation approach.
> >
> > On May 8, 2013, at 5:11 PM, Ted Yu wrote:
> >
> > > bq. by recognizing existing tables with "." as part of the default
> > > namespace or automatically create namespaces for tables with dots in
> > them.
> > >
> > > I think putting existing tables with "." in table name as part of
> default
> > > namespace is better choice among the two.
> > >
> > > Cheers
> > >
> > > On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org> wrote:
> > >
> > >> There shouldn't be any ambiguity. There's fully-qualified table names
> > and
> > >> there's table names.  Table name constant changes were to make the
> names
> > >> less funky.
> > >>
> > >> I like your suggestion since it simplifies migration. Though it seems
> > >> we're kicking the can down the road here. In a way we're avoiding the
> > >> problem by specifying an internal delimiter and adding extra
> complexity
> > to
> > >> prevent the user from using it. Having a way of specifying a table
> fully
> > >> qualified seems to be something fundamental and convenient, if we
> don't
> > >> support one now we'll have even more trouble in the future. Looking at
> > the
> > >> suggestions we can potentially make migration painless by recognizing
> > >> existing tables with "." as part of the default namespace or
> > automatically
> > >> create namespaces for tables with dots in them. Neither requires
> > renaming
> > >> tables. They only need to rename tables if they want to start
> organizing
> > >> things into namespaces which they will have to do in any scenario.
> > >>
> > >> -Francis
> > >>
> > >> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
> > >>
> > >>> With this solution there's no naming ambiguity.  There's no
> > >>> overloading table name to actually be two different things.  There's
> > >>> no need for users to rename their tables. Most code that is already
> > >>> written will still be source compatible.  No need to change table
> name
> > >>> constants or anything like that.
> > >>>
> > >>
> > >>
> >
> >
>

Re: [DISCUSS] Namespace Delimiter

Posted by Sergey Shelukhin <se...@hortonworks.com>.
I think if we only have a string for the fully qualified table name internally
and rely on parsing the dot, we are going to have migration issues somewhere.
Externally I agree with Elliott, we can do all kinds of things like add
parameters, but internally IMHO we need a separate namespace argument, or if
that is too painful, use a different internal separator that will not cause
backward-compat problems and change it for display only.

On Wed, May 8, 2013 at 6:00 PM, Francis Liu <to...@apache.org> wrote:

> > I think putting existing tables with "." in table name as part of default
> > namespace is better choice among the two.
>
> Correct me if I missed something here. That is probably the more difficult
> of the two as you can't derive membership information by just parsing the
> table name. Which is what is passed around internally most of the time.
> Because of that I prefer the auto namespace generation approach.
>
> On May 8, 2013, at 5:11 PM, Ted Yu wrote:
>
> > bq. by recognizing existing tables with "." as part of the default
> > namespace or automatically create namespaces for tables with dots in
> them.
> >
> > I think putting existing tables with "." in table name as part of default
> > namespace is better choice among the two.
> >
> > Cheers
> >
> > On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org> wrote:
> >
> >> There shouldn't be any ambiguity. There's fully-qualified table names and
> >> there's table names.  Table name constant changes were to make the names
> >> less funky.
> >>
> >> I like your suggestion since it simplifies migration. Though it seems
> >> we're kicking the can down the road here. In a way we're avoiding the
> >> problem by specifying an internal delimiter and adding extra complexity to
> >> prevent the user from using it. Having a way of specifying a table fully
> >> qualified seems to be something fundamental and convenient, if we don't
> >> support one now we'll have even more trouble in the future. Looking at the
> >> suggestions we can potentially make migration painless by recognizing
> >> existing tables with "." as part of the default namespace or automatically
> >> create namespaces for tables with dots in them. Neither requires renaming
> >> tables. They only need to rename tables if they want to start organizing
> >> things into namespaces which they will have to do in any scenario.
> >>
> >> -Francis
> >>
> >> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
> >>
> >>> With this solution there's no naming ambiguity.  There's no
> >>> overloading table name to actually be two different things.  There's
> >>> no need for users to rename their tables. Most code that is already
> >>> written will still be source compatible.  No need to change table name
> >>> constants or anything like that.
> >>>
> >>
> >>
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Ted Yu <yu...@gmail.com>.
bq. as you can't derive membership information by just parsing the table
name.

Can we store additional information, at migration time, in system.namespace
table so that namespace membership can be determined without ambiguity ?

Cheers

On Wed, May 8, 2013 at 6:00 PM, Francis Liu <to...@apache.org> wrote:

> > I think putting existing tables with "." in table name as part of default
> > namespace is better choice among the two.
>
> Correct me if I missed something here. That is probably the more difficult
> of the two as you can't derive membership information by just parsing the
> table name. Which is what is passed around internally most of the time.
> Because of that I prefer the auto namespace generation approach.
>
> On May 8, 2013, at 5:11 PM, Ted Yu wrote:
>
> > bq. by recognizing existing tables with "." as part of the default
> > namespace or automatically create namespaces for tables with dots in them.
> >
> > I think putting existing tables with "." in table name as part of default
> > namespace is better choice among the two.
> >
> > Cheers
> >
> > On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org> wrote:
> >
> >> There shouldn't be any ambiguity. There's fully-qualified table names and
> >> there's table names.  Table name constant changes were to make the names
> >> less funky.
> >>
> >> I like your suggestion since it simplifies migration. Though it seems
> >> we're kicking the can down the road here. In a way we're avoiding the
> >> problem by specifying an internal delimiter and adding extra complexity to
> >> prevent the user from using it. Having a way of specifying a table fully
> >> qualified seems to be something fundamental and convenient, if we don't
> >> support one now we'll have even more trouble in the future. Looking at the
> >> suggestions we can potentially make migration painless by recognizing
> >> existing tables with "." as part of the default namespace or automatically
> >> create namespaces for tables with dots in them. Neither requires renaming
> >> tables. They only need to rename tables if they want to start organizing
> >> things into namespaces which they will have to do in any scenario.
> >>
> >> -Francis
> >>
> >> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
> >>
> >>> With this solution there's no naming ambiguity.  There's no
> >>> overloading table name to actually be two different things.  There's
> >>> no need for users to rename their tables. Most code that is already
> >>> written will still be source compatible.  No need to change table name
> >>> constants or anything like that.
> >>>
> >>
> >>
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
> I think putting existing tables with "." in table name as part of default
> namespace is better choice among the two.

Correct me if I missed something here. That is probably the more difficult of the two as you can't derive membership information by just parsing the table name. Which is what is passed around internally most of the time. Because of that I prefer the auto namespace generation approach. 

On May 8, 2013, at 5:11 PM, Ted Yu wrote:

> bq. by recognizing existing tables with "." as part of the default
> namespace or automatically create namespaces for tables with dots in them.
> 
> I think putting existing tables with "." in table name as part of default
> namespace is better choice among the two.
> 
> Cheers
> 
> On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org> wrote:
> 
>> There shouldn't be any ambiguity. There's fully-qualified table names and
>> there's table names.  Table name constant changes were to make the names
>> less funky.
>> 
>> I like your suggestion since it simplifies migration. Though it seems
>> we're kicking the can down the road here. In a way we're avoiding the
>> problem by specifying an internal delimiter and adding extra complexity to
>> prevent the user from using it. Having a way of specifying a table fully
>> qualified seems to be something fundamental and convenient, if we don't
>> support one now we'll have even more trouble in the future. Looking at the
>> suggestions we can potentially make migration painless by recognizing
>> existing tables with "." as part of the default namespace or automatically
>> create namespaces for tables with dots in them. Neither requires renaming
>> tables. They only need to rename tables if they want to start organizing
>> things into namespaces which they will have to do in any scenario.
>> 
>> -Francis
>> 
>> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
>> 
>>> With this solution there's no naming ambiguity.  There's no
>>> overloading table name to actually be two different things.  There's
>>> no need for users to rename their tables. Most code that is already
>>> written will still be source compatible.  No need to change table name
>>> constants or anything like that.
>>> 
>> 
>> 


Re: [DISCUSS] Namespace Delimiter

Posted by Ted Yu <yu...@gmail.com>.
bq. by recognizing existing tables with "." as part of the default
namespace or automatically create namespaces for tables with dots in them.

I think putting existing tables with "." in table name as part of default
namespace is better choice among the two.

Cheers

On Wed, May 8, 2013 at 5:02 PM, Francis Liu <to...@apache.org> wrote:

> There shouldn't be any ambiguity. There's fully-qualified table names and
> there's table names.  Table name constant changes were to make the names
> less funky.
>
> I like your suggestion since it simplifies migration. Though it seems
> we're kicking the can down the road here. In a way we're avoiding the
> problem by specifying an internal delimiter and adding extra complexity to
> prevent the user from using it. Having a way of specifying a table fully
> qualified seems to be something fundamental and convenient, if we don't
> support one now we'll have even more trouble in the future. Looking at the
> suggestions we can potentially make migration painless by recognizing
> existing tables with "." as part of the default namespace or automatically
> create namespaces for tables with dots in them. Neither requires renaming
> tables. They only need to rename tables if they want to start organizing
> things into namespaces which they will have to do in any scenario.
>
> -Francis
>
> On May 8, 2013, at 1:27 PM, Elliott Clark wrote:
>
> > With this solution there's no naming ambiguity.  There's no
> > overloading table name to actually be two different things.  There's
> > no need for users to rename their tables. Most code that is already
> > written will still be source compatible.  No need to change table name
> > constants or anything like that.
> >
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
There shouldn't be any ambiguity. There's fully-qualified table names and there's table names.  Table name constant changes were to make the names less funky. 

I like your suggestion since it simplifies migration. Though it seems we're kicking the can down the road here. In a way we're avoiding the problem by specifying an internal delimiter and adding extra complexity to prevent the user from using it. Having a way of specifying a table fully qualified seems to be something fundamental and convenient, if we don't support one now we'll have even more trouble in the future. Looking at the suggestions we can potentially make migration painless by recognizing existing tables with "." as part of the default namespace or automatically create namespaces for tables with dots in them. Neither requires renaming tables. They only need to rename tables if they want to start organizing things into namespaces which they will have to do in any scenario.

-Francis

On May 8, 2013, at 1:27 PM, Elliott Clark wrote:

> With this solution there's no naming ambiguity.  There's no
> overloading table name to actually be two different things.  There's
> no need for users to rename their tables. Most code that is already
> written will still be source compatible.  No need to change table name
> constants or anything like that.
> 


Re: [DISCUSS] Namespace Delimiter

Posted by Elliott Clark <ec...@apache.org>.
I've said most of this on the jira but I'll add it to the discussion here.

Allowing table names with dots is easy if we just add a parameter.

HTable(Configuration conf, final String tableName) will just assume
the default.  (This keeps all backwards compatibility for everything
but meta.)
HTable(Configuration conf, final String tableName, String nameSpace)
is added.  This allows namespaces to have naming very similar to
tables (more flexibility for users is better imo).

The shell gets an optional namespace parameter added on to lots of the
commands.  If it's there then take the namespace.  If it's not then
assume default.  Just like HTable this keeps backwards compatibility
for everything but meta.  Yes this is the messiest part, but it's the
least important imo.  The java api is what should be the prime focus.
Let others add a SQL like (or full sql) shell on top of HBase.  We
shouldn't make using the java api more confusing for users who are
used to our api, just so that HBase can be like sql.

Meta's pretty easy.  We already use a comma as a separator for
different parts of the key.

Zk can be handled in much the same way as meta (with a separator
that's already not a legal table name character).

With this solution there's no naming ambiguity.  There's no
overloading table name to actually be two different things.  There's
no need for users to rename their tables. Most code that is already
written will still be source compatible.  No need to change table name
constants or anything like that.
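
A minimal Java sketch of the separate-parameter idea described above. The
names TableRef, DEFAULT_NAMESPACE and metaKeyPrefix are hypothetical and
exist only for illustration; they are not part of the HBase API or of the
HBASE-8015 patch, and "default" as the namespace name is an assumption.
The argument order mirrors the proposed HTable(conf, tableName, nameSpace).

```java
// Hypothetical value type: namespace and table name are kept as separate
// fields, so a '.' inside a table name never has to be parsed.
final class TableRef {
    // Assumed name for the namespace that holds all pre-existing tables.
    static final String DEFAULT_NAMESPACE = "default";

    final String tableName;
    final String namespace;

    // One-argument form: existing callers keep working and land in the default namespace.
    TableRef(String tableName) {
        this(tableName, DEFAULT_NAMESPACE);
    }

    // Two-argument form: the namespace is passed explicitly instead of being encoded in the name.
    TableRef(String tableName, String namespace) {
        if (tableName == null || namespace == null) {
            throw new IllegalArgumentException("table name and namespace are required");
        }
        this.tableName = tableName;
        this.namespace = namespace;
    }

    // Illustrative key prefix joined with ',', a character that table names may not contain.
    String metaKeyPrefix() {
        return namespace + "," + tableName + ",";
    }
}
```

With this shape, new TableRef("foo.bar") and new TableRef("bar", "foo")
are two distinct tables, and neither has to be renamed.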

On Wed, May 8, 2013 at 12:01 PM, Sergey Shelukhin
<se...@hortonworks.com> wrote:
> I think if we want to use a dot, we need to be able to support both old
> tables with dot, and table in namespace. Consequently parsing should not
> rely on a dot or store "truth" info about NS tables as one string with a
> dot, I lost track of the patch a while ago but I think it was considered at
> some point...
> I thought about it a bit and I think the order of resolution should be
> old-dot-table overriding namespace table. Creating tables with a dot should
> not be allowed. Creating tables in namespace that would be shadowed by
> legacy old tables should not be allowed too (e.g. if you have old table
> "foo.bar" trying to create table "bar" in ns "foo" would cause an error).
>
> So users have no inconvenience with legacy tables, and only get
> inconvenienced in a very explicit way if they use namespaces in a certain
> way.
>
>
>
>
> On Tue, May 7, 2013 at 11:55 PM, James Taylor <jt...@salesforce.com>wrote:
>
>> Phoenix uses  <schema name> . <table name> to reference tables, so
>> allowing a "." in names would make parsing ambiguous.
>>
>>     James
>>
>>
>> On 05/07/2013 11:36 PM, Stack wrote:
>>
>>> On Tue, May 7, 2013 at 5:22 PM, Francis Liu <to...@apache.org> wrote:
>>>
>>>  One thing I had in mind was to automatically assume that the first dot
>>>> delimits the namespace name. During upgrade we automatically create those
>>>> namespaces and assign the tables accordingly. They can then eventually
>>>> migrate/rename their tables (if needed) at a later time. In the extreme
>>>> case that would be one namespace per table. For which we will provide a
>>>> tool to rename offline tables.
>>>>
>>>> I'm guessing most cases would not require a rename. What else do people
>>>> use dots in their table name for?
>>>>
>>>>
>>> With namespaces in place, will '.' be illegal in a table name?
>>>
>>> With namespaces, is there a no-namespace/default location?  If so, what
>>> will it be called or how will you refer to tables in the
>>> no-namespace/default namespace?
>>>
>>> I just took a user's production website where there are hundreds of
>>> tables.
>>>   For no good reason that I can see, they happened to have chosen '_' and
>>> '-' as table name partitioner: i.e. application_feature, etc.  My sense is
>>> they could just as easily have gone with '.' but maybe the '.META.' name
>>> frightens people away from '.'?
>>>
>>> Anyone using '.' in their table names?
>>>
>>> St.Ack
>>>
>>
>>

Re: [DISCUSS] Namespace Delimiter

Posted by Sergey Shelukhin <se...@hortonworks.com>.
I think if we want to use a dot, we need to be able to support both old
tables with a dot and tables in a namespace. Consequently parsing should not
rely on a dot or store "truth" info about NS tables as one string with a
dot. I lost track of the patch a while ago but I think it was considered at
some point...
I thought about it a bit and I think the order of resolution should be
old-dot-table overriding namespace table. Creating tables with a dot should
not be allowed. Creating tables in namespace that would be shadowed by
legacy old tables should not be allowed too (e.g. if you have old table
"foo.bar" trying to create table "bar" in ns "foo" would cause an error).

So users have no inconvenience with legacy tables, and only get
inconvenienced in a very explicit way if they use namespaces in a certain
way.
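
A rough Java sketch of the resolution order described above, assuming an
in-memory registry. LegacyFirstResolver and all of its members are
hypothetical names for illustration, not code from the patch. It also
assumes that new table names contain no '.', so the last dot in a
qualified name separates namespace from table.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry: legacy dotted tables win over namespace tables.
final class LegacyFirstResolver {
    enum Kind { LEGACY, NAMESPACE, MISSING }

    private final Set<String> legacyTables = new HashSet<>();            // pre-upgrade names such as "foo.bar"
    private final Map<String, Set<String>> byNamespace = new HashMap<>(); // namespace -> table names

    void registerLegacy(String name) {
        legacyTables.add(name);
    }

    void create(String namespace, String table) {
        // New tables may not contain a dot.
        if (table.indexOf('.') >= 0) {
            throw new IllegalArgumentException("new table names may not contain '.'");
        }
        // A namespace table that a legacy table would shadow is rejected up front.
        if (legacyTables.contains(namespace + "." + table)) {
            throw new IllegalArgumentException("shadowed by legacy table " + namespace + "." + table);
        }
        byNamespace.computeIfAbsent(namespace, k -> new HashSet<>()).add(table);
    }

    Kind resolve(String name) {
        // Legacy lookup comes first, so old clients see no change.
        if (legacyTables.contains(name)) {
            return Kind.LEGACY;
        }
        int dot = name.lastIndexOf('.');
        if (dot > 0) {
            Set<String> tables = byNamespace.get(name.substring(0, dot));
            if (tables != null && tables.contains(name.substring(dot + 1))) {
                return Kind.NAMESPACE;
            }
        }
        return Kind.MISSING;
    }
}
```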




On Tue, May 7, 2013 at 11:55 PM, James Taylor <jt...@salesforce.com>wrote:

> Phoenix uses  <schema name> . <table name> to reference tables, so
> allowing a "." in names would make parsing ambiguous.
>
>     James
>
>
> On 05/07/2013 11:36 PM, Stack wrote:
>
>> On Tue, May 7, 2013 at 5:22 PM, Francis Liu <to...@apache.org> wrote:
>>
>>  One thing I had in mind was to automatically assume that the first dot
>>> delimits the namespace name. During upgrade we automatically create those
>>> namespaces and assign the tables accordingly. They can then eventually
>>> migrate/rename their tables (if needed) at a later time. In the extreme
>>> case that would be one namespace per table. For which we will provide a
>>> tool to rename offline tables.
>>>
>>> I'm guessing most cases would not require a rename. What else do people
>>> use dots in their table name for?
>>>
>>>
>> With namespaces in place, will '.' be illegal in a table name?
>>
>> With namespaces, is there a no-namespace/default location?  If so, what
>> will it be called or how will you refer to tables in the
>> no-namespace/default namespace?
>>
>> I just took a user's production website where there are hundreds of
>> tables.
>>   For no good reason that I can see, they happened to have chosen '_' and
>> '-' as table name partitioner: i.e. application_feature, etc.  My sense is
>> they could just as easily have gone with '.' but maybe the '.META.' name
>> frightens people away from '.'?
>>
>> Anyone using '.' in their table names?
>>
>> St.Ack
>>
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by James Taylor <jt...@salesforce.com>.
Sorry, didn't explain the current Phoenix behavior correctly. Phoenix 
uses "." in table names currently by combining the <schema name> and 
<table name> together with a "." as the separator.  For example:
CREATE TABLE SCOTT.TIGER (...) would create a table named SCOTT.TIGER. I 
suspect it's pretty common to use a "." in the table name, so changing 
this behavior now would be painful.

     James

On 05/08/2013 11:54 AM, Enis Söztutar wrote:
> "." is the de-facto way of doing something like this, but now I tend to buy
> the argument that forcing people to rename their tables will be a lot of
> trouble.
>
> I think it is reasonable to
>   - Remove "." to be a valid table name character. You won't be able to
> create a table with a "." in the name.
>   - Keep migrated tables under default namespace. default namespace will
> contain tables with dot in their name as well.
>   - If you have a table "a.b", you cannot create a namespace named "a"
>   - Whenever we refer to table "a.b", we can search for namespace "a", if
> not found search for table "a.b" in default namespace.
>
> Would that work.
> Enis
>
>
>
> On Tue, May 7, 2013 at 11:55 PM, James Taylor <jt...@salesforce.com>wrote:
>
>> Phoenix uses  <schema name> . <table name> to reference tables, so
>> allowing a "." in names would make parsing ambiguous.
>>
>>      James
>>
>>
>> On 05/07/2013 11:36 PM, Stack wrote:
>>
>>> On Tue, May 7, 2013 at 5:22 PM, Francis Liu <to...@apache.org> wrote:
>>>
>>>   One thing I had in mind was to automatically assume that the first dot
>>>> delimits the namespace name. During upgrade we automatically create those
>>>> namespaces and assign the tables accordingly. They can then eventually
>>>> migrate/rename their tables (if needed) at a later time. In the extreme
>>>> case that would be one namespace per table. For which we will provide a
>>>> tool to rename offline tables.
>>>>
>>>> I'm guessing most cases would not require a rename. What else do people
>>>> use dots in their table name for?
>>>>
>>>>
>>> With namespaces in place, will '.' be illegal in a table name?
>>>
>>> With namespaces, is there a no-namespace/default location?  If so, what
>>> will it be called or how will you refer to tables in the
>>> no-namespace/default namespace?
>>>
>>> I just took a user's production website where there are hundreds of
>>> tables.
>>>    For no good reason that I can see, they happened to have chosen '_' and
>>> '-' as table name partitioner: i.e. application_feature, etc.  My sense is
>>> they could just as easily have gone with '.' but maybe the '.META.' name
>>> frightens people away from '.'?
>>>
>>> Anyone using '.' in their table names?
>>>
>>> St.Ack
>>>
>>


Re: [DISCUSS] Namespace Delimiter

Posted by Enis Söztutar <en...@gmail.com>.
"." is the de-facto way of doing something like this, but now I tend to buy
the argument that forcing people to rename their tables will be a lot of
trouble.

I think it is reasonable to
 - Remove "." as a valid table name character. You won't be able to
create a table with a "." in the name.
 - Keep migrated tables under default namespace. default namespace will
contain tables with dot in their name as well.
 - If you have a table "a.b", you cannot create a namespace named "a"
 - Whenever we refer to table "a.b", we can search for namespace "a", if
not found search for table "a.b" in default namespace.
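
The four rules above could be sketched roughly as follows in Java. Catalog
and its methods are hypothetical names for illustration only, "default" as
the namespace name is an assumption, and splitting on the first dot is one
possible reading of the last rule.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory catalog that applies the four rules.
final class Catalog {
    static final String DEFAULT_NS = "default"; // assumed name

    private final Map<String, Set<String>> tablesByNamespace = new HashMap<>();

    Catalog() {
        tablesByNamespace.put(DEFAULT_NS, new HashSet<>());
    }

    // Rule 2: migrated tables keep their dotted names and live in the default namespace.
    void addMigratedTable(String legacyName) {
        tablesByNamespace.get(DEFAULT_NS).add(legacyName);
    }

    // Rule 3: namespace "a" cannot be created while a default-namespace table "a.b" exists.
    void createNamespace(String ns) {
        for (String table : tablesByNamespace.get(DEFAULT_NS)) {
            if (table.startsWith(ns + ".")) {
                throw new IllegalArgumentException("conflicts with existing table " + table);
            }
        }
        tablesByNamespace.putIfAbsent(ns, new HashSet<>());
    }

    // Rule 1: newly created table names may not contain '.'.
    void createTable(String ns, String table) {
        if (table.indexOf('.') >= 0) {
            throw new IllegalArgumentException("'.' is not allowed in new table names");
        }
        Set<String> tables = tablesByNamespace.get(ns);
        if (tables == null) {
            throw new IllegalArgumentException("unknown namespace " + ns);
        }
        tables.add(table);
    }

    // Rule 4: look for the namespace first, then fall back to the default namespace.
    // Returns {namespace, table}, or null when nothing matches.
    String[] resolve(String name) {
        int dot = name.indexOf('.');
        if (dot > 0) {
            String ns = name.substring(0, dot);
            Set<String> tables = tablesByNamespace.get(ns);
            if (tables != null && !ns.equals(DEFAULT_NS)) {
                String table = name.substring(dot + 1);
                return tables.contains(table) ? new String[] {ns, table} : null;
            }
        }
        return tablesByNamespace.get(DEFAULT_NS).contains(name)
                ? new String[] {DEFAULT_NS, name}
                : null;
    }
}
```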

Would that work?
Enis



On Tue, May 7, 2013 at 11:55 PM, James Taylor <jt...@salesforce.com>wrote:

> Phoenix uses  <schema name> . <table name> to reference tables, so
> allowing a "." in names would make parsing ambiguous.
>
>     James
>
>
> On 05/07/2013 11:36 PM, Stack wrote:
>
>> On Tue, May 7, 2013 at 5:22 PM, Francis Liu <to...@apache.org> wrote:
>>
>>  One thing I had in mind was to automatically assume that the first dot
>>> delimits the namespace name. During upgrade we automatically create those
>>> namespaces and assign the tables accordingly. They can then eventually
>>> migrate/rename their tables (if needed) at a later time. In the extreme
>>> case that would be one namespace per table. For which we will provide a
>>> tool to rename offline tables.
>>>
>>> I'm guessing most cases would not require a rename. What else do people
>>> use dots in their table name for?
>>>
>>>
>> With namespaces in place, will '.' be illegal in a table name?
>>
>> With namespaces, is there a no-namespace/default location?  If so, what
>> will it be called or how will you refer to tables in the
>> no-namespace/default namespace?
>>
>> I just took a user's production website where there are hundreds of
>> tables.
>>   For no good reason that I can see, they happened to have chosen '_' and
>> '-' as table name partitioner: i.e. application_feature, etc.  My sense is
>> they could just as easily have gone with '.' but maybe the '.META.' name
>> frightens people away from '.'?
>>
>> Anyone using '.' in their table names?
>>
>> St.Ack
>>
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by James Taylor <jt...@salesforce.com>.
Phoenix uses  <schema name> . <table name> to reference tables, so 
allowing a "." in names would make parsing ambiguous.

     James

On 05/07/2013 11:36 PM, Stack wrote:
> On Tue, May 7, 2013 at 5:22 PM, Francis Liu <to...@apache.org> wrote:
>
>> One thing I had in mind was to automatically assume that the first dot
>> delimits the namespace name. During upgrade we automatically create those
>> namespaces and assign the tables accordingly. They can then eventually
>> migrate/rename their tables (if needed) at a later time. In the extreme
>> case that would be one namespace per table. For which we will provide a
>> tool to rename offline tables.
>>
>> I'm guessing most cases would not require a rename. What else do people
>> use dots in their table name for?
>>
>
> With namespaces in place, will '.' be illegal in a table name?
>
> With namespaces, is there a no-namespace/default location?  If so, what
> will it be called or how will you refer to tables in the
> no-namespace/default namespace?
>
> I just took a user's production website where there are hundreds of tables.
>   For no good reason that I can see, they happened to have chosen '_' and
> '-' as table name partitioner: i.e. application_feature, etc.  My sense is
> they could just as easily have gone with '.' but maybe the '.META.' name
> frightens people away from '.'?
>
> Anyone using '.' in their table names?
>
> St.Ack


Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
In the current implementation it will only be allowed as a namespace delimiter. We can relax that to support applications that use more than one '.' in a name,

e.g. org.apache.hbase.MyTable

Namespace = org.apache.hbase
Table = MyTable
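
A small Java sketch of that split, assuming the last '.' is the delimiter.
QualifiedName.split is a hypothetical helper for illustration, not a
method from the patch, and "default" for undotted names is an assumption.

```java
// Hypothetical helper: everything before the last '.' is the namespace.
final class QualifiedName {
    // Returns {namespace, table}. Names without a dot go to the assumed "default" namespace.
    static String[] split(String fullyQualified) {
        int dot = fullyQualified.lastIndexOf('.');
        if (dot < 0) {
            return new String[] {"default", fullyQualified};
        }
        // An empty namespace or table part is rejected in this sketch.
        if (dot == 0 || dot == fullyQualified.length() - 1) {
            throw new IllegalArgumentException("empty namespace or table in " + fullyQualified);
        }
        return new String[] {
            fullyQualified.substring(0, dot),
            fullyQualified.substring(dot + 1)
        };
    }
}
```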



On May 7, 2013, at 11:36 PM, Stack wrote:

> With namespaces in place, will '.' be illegal in a table name?
> 
> With namespaces, is there a no-namespace/default location?  If so, what
> will it be called or how will you refer to tables in the
> no-namespace/default namespace?
> 
> I just took a user's production website where there are hundreds of tables.
> For no good reason that I can see, they happened to have chosen '_' and
> '-' as table name partitioner: i.e. application_feature, etc.  My sense is
> they could just as easily have gone with '.' but maybe the '.META.' name
> frightens people away from '.'?
> 
> Anyone using '.' in their table names?
> 
> St.Ack


Re: [DISCUSS] Namespace Delimiter

Posted by Stack <st...@duboce.net>.
On Tue, May 7, 2013 at 5:22 PM, Francis Liu <to...@apache.org> wrote:

> One thing I had in mind was to automatically assume that the first dot
> delimits the namespace name. During upgrade we automatically create those
> namespaces and assign the tables accordingly. They can then eventually
> migrate/rename their tables (if needed) at a later time. In the extreme
> case that would be one namespace per table. For which we will provide a
> tool to rename offline tables.
>
> I'm guessing most cases would not require a rename. What else do people
> use dots in their table name for?
>


With namespaces in place, will '.' be illegal in a table name?

With namespaces, is there a no-namespace/default location?  If so, what
will it be called or how will you refer to tables in the
no-namespace/default namespace?

I just took a user's production website where there are hundreds of tables.
For no good reason that I can see, they happened to have chosen '_' and
'-' as table name partitioner: i.e. application_feature, etc.  My sense is
they could just as easily have gone with '.' but maybe the '.META.' name
frightens people away from '.'?

Anyone using '.' in their table names?

St.Ack

Re: [DISCUSS] Namespace Delimiter

Posted by Francis Liu <to...@apache.org>.
One thing I had in mind was to automatically assume that the first dot delimits the namespace name. During upgrade we automatically create those namespaces and assign the tables accordingly. They can then eventually migrate/rename their tables (if needed) at a later time. In the extreme case that would be one namespace per table. For which we will provide a tool to rename offline tables.
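
To make the upgrade step concrete, here is a rough Java sketch of the
grouping it implies. UpgradePlanner.plan is a hypothetical helper, not the
actual migration code, and placing undotted names in a namespace called
"default" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical planner: derives namespaces from existing table names at upgrade time.
final class UpgradePlanner {
    // Maps each namespace to the tables assigned to it, splitting on the first '.'.
    static Map<String, List<String>> plan(List<String> existingTables) {
        Map<String, List<String>> plan = new TreeMap<>();
        for (String name : existingTables) {
            int dot = name.indexOf('.');
            // Only split when both sides of the dot are non-empty.
            boolean splittable = dot > 0 && dot < name.length() - 1;
            String namespace = splittable ? name.substring(0, dot) : "default";
            String table = splittable ? name.substring(dot + 1) : name;
            plan.computeIfAbsent(namespace, k -> new ArrayList<>()).add(table);
        }
        return plan;
    }
}
```

In the extreme case mentioned above, every dotted table has a distinct
prefix and the plan ends up with one namespace per table.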

I'm guessing most cases would not require a rename. What else do people use dots in their table name for? 

-Francis

On May 7, 2013, at 4:49 PM, Ian Varley wrote:

> I would also submit that "." is a pretty universal standard (citation needed) in relational databases for separating namespaces (schemas, etc.) from tables. We use that now to represent the same idea, and using a different delimiter would be less than ideal in the long run.
> 
> But, I agree with Jon - anything in the 0.96 upgrade that causes people to change client code in lock-step isn't going to fly.
> 
> Is there any solution which can use "." but be transparent at upgrade time? I.e. you could still refer to it by its full "Namespace.Table" name in client code, and it does a little more work to try both combinations? That'd prevent cases where someone already has tables called both "Y" and "X.Y", but, come on, who does that?
> 
> Ian
> 
> On May 7, 2013, at 6:43 PM, Jonathan Hsieh wrote:
> 
> I prefer using a delimiter that does not require migration.  As someone who
> has to support a wide variety of users, this will cause much less confusion
> from our users (and save me grief!)
> 
> From the code [1], any symbol char other than '.', '_', or '-' would be an
> ok delimiter.  how about a ':' or '#'?
> 
> [1]
> https://github.com/apache/hbase/blob/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java#L415
> 
> Jon.
> 
> 
> On Tue, May 7, 2013 at 4:38 PM, Francis Liu <to...@apache.org> wrote:
> 
> Hi,
> 
> As part of the namespace patch (HBASE-8015). We will need a delimiter to
> separate namespace name from table name. The obvious choice here would be a
> dot '.'. Since dot is presently a valid character for table names that
> would require users to migrate their tables (ie renaming tables) as part of
> upgrade to 0.96. Another option is to use a different delimiter to avoid
> the table migration altogether. Thoughts?
> 
> -Francis
> 
> 
> 
> 
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
> 


Re: [DISCUSS] Namespace Delimiter

Posted by Shahab Yunus <sh...@gmail.com>.
'.' would have been ideal but I agree with it causing serious migration
issues. I think we can use '#'. It is not a widely used character in names or
naming of objects or variables (I don't have any scientific data, just
observation). Also, Pig uses it to look up a key's value in the map that
represents an HBase column family. Plus it is already used as a special
character in other technologies (comments in MySQL, temp tables in MS SQL
etc.)


On Tue, May 7, 2013 at 7:57 PM, Ted Yu <yu...@gmail.com> wrote:

> Interesting discussion.
>
> On Tue, May 7, 2013 at 4:49 PM, Ian Varley <iv...@salesforce.com> wrote:
>
> > I would also submit that "." is a pretty universal standard (citation
> > needed) in relational databases for separating namespaces (schemas, etc.)
> > from tables. We use that now to represent the same idea, and using a
> > different delimiter would be less than ideal in the long run.
> >
> > But, I agree with Jon - anything in the 0.96 upgrade that causes people to
> > change client code in lock-step isn't going to fly.
> >
> > Is there any solution which can use "." but be transparent at upgrade
> > time? I.e. you could still refer to it by its full "Namespace.Table" name
> > in client code, and it does a little more work to try both combinations?
> > That'd prevent cases where someone already has tables called both "Y" and
> > "X.Y", but, come on, who does that?
> >
> > Ian
> >
> > On May 7, 2013, at 6:43 PM, Jonathan Hsieh wrote:
> >
> > I prefer using a delimiter that does not require migration.  As someone who
> > has to support a wide variety of users, this will cause much less confusion
> > from our users (and save me grief!)
> >
> > From the code [1], any symbol char other than '.', '_', or '-' would be an
> > ok delimiter.  how about a ':' or '#'?
> >
> > [1]
> >
> >
> https://github.com/apache/hbase/blob/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java#L415
> >
> > Jon.
> >
> >
> > On Tue, May 7, 2013 at 4:38 PM, Francis Liu <to...@apache.org> wrote:
> >
> > Hi,
> >
> > As part of the namespace patch (HBASE-8015). We will need a delimiter to
> > separate namespace name from table name. The obvious choice here would be a
> > dot '.'. Since dot is presently a valid character for table names that
> > would require users to migrate their tables (ie renaming tables) as part of
> > upgrade to 0.96. Another option is to use a different delimiter to avoid
> > the table migration altogether. Thoughts?
> >
> > -Francis
> >
> >
> >
> >
> > --
> > // Jonathan Hsieh (shay)
> > // Software Engineer, Cloudera
> > // jon@cloudera.com
> >
> >
>

Re: [DISCUSS] Namespace Delimiter

Posted by Ted Yu <yu...@gmail.com>.
Interesting discussion.

On Tue, May 7, 2013 at 4:49 PM, Ian Varley <iv...@salesforce.com> wrote:

> I would also submit that "." is a pretty universal standard (citation
> needed) in relational databases for separating namespaces (schemas, etc.)
> from tables. We use that now to represent the same idea, and using a
> different delimiter would be less than ideal in the long run.
>
> But, I agree with Jon - anything in the 0.96 upgrade that causes people to
> change client code in lock-step isn't going to fly.
>
> Is there any solution which can use "." but be transparent at upgrade
> time? I.e. you could still refer to it by its full "Namespace.Table" name
> in client code, and it does a little more work to try both combinations?
> That'd prevent cases where someone already has tables called both "Y" and
> "X.Y", but, come on, who does that?
>
> Ian
>
> On May 7, 2013, at 6:43 PM, Jonathan Hsieh wrote:
>
> I prefer using a delimiter that does not require migration.  As someone who
> has to support a wide variety of users, this will cause much less confusion
> from our users (and save me grief!)
>
> From the code [1], any symbol char other than '.', '_', or '-' would be an
> ok delimiter.  How about a ':' or '#'?
>
> [1]
>
> https://github.com/apache/hbase/blob/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java#L415
>
> Jon.
>
>
> On Tue, May 7, 2013 at 4:38 PM, Francis Liu <to...@apache.org> wrote:
>
> Hi,
>
> As part of the namespace patch (HBASE-8015). We will need a delimiter to
> separate namespace name from table name. The obvious choice here would be a
> dot '.'. Since dot is presently a valid character for table names that
> would require users to migrate their tables (ie renaming tables) as part of
> upgrade to 0.96. Another option is to use a different delimiter to avoid
> the table migration altogether. Thoughts?
>
> -Francis
>
>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>
>

Re: [DISCUSS] Namespace Delimiter

Posted by Ian Varley <iv...@salesforce.com>.
I would also submit that "." is a pretty universal standard (citation needed) in relational databases for separating namespaces (schemas, etc.) from tables. We use that now to represent the same idea, and using a different delimiter would be less than ideal in the long run.

But, I agree with Jon - anything in the 0.96 upgrade that causes people to change client code in lock-step isn't going to fly.

Is there any solution which can use "." but be transparent at upgrade time? I.e. you could still refer to it by its full "Namespace.Table" name in client code, and it does a little more work to try both combinations? That'd prevent cases where someone already has tables called both "Y" and "X.Y", but, come on, who does that?
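
For illustration only, the "try both combinations" lookup might look like the following Java sketch. The class DottedNameResolver, its method, and the two in-memory sets standing in for the catalog are all hypothetical; nothing here is taken from HBASE-8015.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not an HBase class.
final class DottedNameResolver {

    // Returns {namespace, table}; an empty namespace means a pre-namespace table.
    static String[] resolve(String name, Set<String> legacyTables, Set<String> namespaces) {
        // First combination: the whole string is an existing table name,
        // so clients written before the upgrade keep working unchanged.
        if (legacyTables.contains(name)) {
            return new String[] {"", name};
        }
        // Second combination: split at the first dot and treat the prefix
        // as a namespace, but only if that namespace is known.
        int dot = name.indexOf('.');
        if (dot > 0 && dot < name.length() - 1) {
            String ns = name.substring(0, dot);
            if (namespaces.contains(ns)) {
                return new String[] {ns, name.substring(dot + 1)};
            }
        }
        throw new IllegalArgumentException("No such table: " + name);
    }

    public static void main(String[] args) {
        // Assumed sample data: two legacy tables (one containing a dot) and one namespace.
        Set<String> legacy = new HashSet<>(Arrays.asList("users", "logs.2013"));
        Set<String> namespaces = new HashSet<>(Arrays.asList("prod"));
        System.out.println(Arrays.toString(resolve("logs.2013", legacy, namespaces)));
        System.out.println(Arrays.toString(resolve("prod.users", legacy, namespaces)));
    }
}
```

The ambiguous case is a legacy table "X.Y" coexisting with a table "Y" in namespace "X". This sketch silently prefers the legacy table, which is the one situation that would still need an explicit policy.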

Ian

On May 7, 2013, at 6:43 PM, Jonathan Hsieh wrote:

I prefer using a delimiter that does not require migration.  As someone who
has to support a wide variety of users, this will cause much less confusion
from our users (and save me grief!)

From the code [1], any symbol char other than '.', '_', or '-' would be an
ok delimiter.  How about a ':' or '#'?

[1]
https://github.com/apache/hbase/blob/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java#L415

Jon.


On Tue, May 7, 2013 at 4:38 PM, Francis Liu <to...@apache.org> wrote:

Hi,

As part of the namespace patch (HBASE-8015). We will need a delimiter to
separate namespace name from table name. The obvious choice here would be a
dot '.'. Since dot is presently a valid character for table names that
would require users to migrate their tables (ie renaming tables) as part of
upgrade to 0.96. Another option is to use a different delimiter to avoid
the table migration altogether. Thoughts?

-Francis




--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com


Re: [DISCUSS] Namespace Delimiter

Posted by Jonathan Hsieh <jo...@cloudera.com>.
I prefer using a delimiter that does not require migration.  As someone who
has to support a wide variety of users, this will cause much less confusion
from our users (and save me grief!)

From the code [1], any symbol char other than '.', '_', or '-' would be an
ok delimiter.  How about a ':' or '#'?

[1]
https://github.com/apache/hbase/blob/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java#L415
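
To make that check concrete, here is a small Java sketch of how a candidate delimiter could be tested against the table-name character rule. The predicate is an assumption based on the description above (letters, digits, '.', '_' and '-' are legal in table names); it is not copied from HTableDescriptor, and the class and method names are hypothetical.

```java
// Hypothetical helper for illustration; not an HBase class.
final class DelimiterCheck {

    // Assumed rule: table names may contain letters, digits, '_', '-' and '.'.
    static boolean isLegalTableNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.';
    }

    // A delimiter needs no table migration if it can never occur in an existing table name.
    static boolean isSafeDelimiter(char c) {
        return !isLegalTableNameChar(c);
    }

    // Splits "namespace<delimiter>table"; a name without the delimiter gets an empty namespace.
    static String[] split(String qualifiedName, char delimiter) {
        int i = qualifiedName.indexOf(delimiter);
        if (i < 0) {
            return new String[] {"", qualifiedName};
        }
        return new String[] {qualifiedName.substring(0, i), qualifiedName.substring(i + 1)};
    }

    public static void main(String[] args) {
        for (char c : new char[] {'.', ':', '#'}) {
            System.out.println("'" + c + "' safe as delimiter: " + isSafeDelimiter(c));
        }
        // An existing table name containing a dot is left untouched when ':' is the delimiter.
        String[] parts = split("prod:logs.2013", ':');
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```

Under this assumed rule an existing name such as "logs.2013" parses the same way before and after the upgrade, because ':' cannot appear in it.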

Jon.


On Tue, May 7, 2013 at 4:38 PM, Francis Liu <to...@apache.org> wrote:

> Hi,
>
> As part of the namespace patch (HBASE-8015). We will need a delimiter to
> separate namespace name from table name. The obvious choice here would be a
> dot '.'. Since dot is presently a valid character for table names that
> would require users to migrate their tables (ie renaming tables) as part of
> upgrade to 0.96. Another option is to use a different delimiter to avoid
> the table migration altogether. Thoughts?
>
> -Francis




-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com