You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Jinfeng Ni <ji...@gmail.com> on 2014/11/13 19:42:38 UTC

SQL identifier maximum length

Hi,

Calcite (formly Optiq) has set the maximum length for an SQL identifier to
128.  This sounds quite reasonable for an ordinary sql identifier.
However, for system like Drill which allows user to query a file directly,
it's quite likely to run out of the allowed identifier length.

Drill allows the use case of :

         select * from dfs.schema.`directory1/directory2/filename.json`;

The directory plus the filename is parsed as a SQL identifer. Since many
file system would allow file name up to 255 bytes [1], it's likely user
could use an identifier with more than 128 bytes when query the file
directly.

I would like to ask the Calcite community whether there is any performance
impact if we bump up the maximum length for a SQL identifier.  If there is
negligible performance impact, then, can we make the maximum length
configurable in the SQL parser?  We could make slightly modification to
CombinedParser.jj template, to allow each system to set its own max length,
with default still being 128.

The approach to allow a configurable setting seems to be match what
Postgres [2] is doing, which allows the length to be "raised by changing
the NAMEDATALEN constant in src/include/pg_config_manual.h." [2]


Any suggestions? Thanks!

Jinfeng



1. http://en.wikipedia.org/wiki/Comparison_of_file_systems
2.
http://www.postgresql.org/docs/current/interactive/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS

Re: SQL identifier maximum length

Posted by Julian Hyde <ju...@hydromatic.net>.
+1

Each system should be able to choose its own identifier max length.

I’d prefer to make it configurable at runtime. Then we can test it. Add an extra parameter to SqlParser’s constructor, and have it call parser.setIdentifierMaxLength, similar to how it calls setQuotedCasing right now.

Or if you want to go further, create 

interface SqlParser.Config {
 int identifierMaxLength();
 Casing quotedCasing();
 Casing unquotedCasing();
}

and replace all of the parameters of SqlParser’s constructor with one Config parameter.

Can you log a jira case for this.

Julian


On Nov 13, 2014, at 10:42 AM, Jinfeng Ni <ji...@gmail.com> wrote:

> Hi,
> 
> Calcite (formly Optiq) has set the maximum length for an SQL identifier to
> 128.  This sounds quite reasonable for an ordinary sql identifier.
> However, for system like Drill which allows user to query a file directly,
> it's quite likely to run out of the allowed identifier length.
> 
> Drill allows the use case of :
> 
>         select * from dfs.schema.`directory1/directory2/filename.json`;
> 
> The directory plus the filename is parsed as a SQL identifer. Since many
> file system would allow file name up to 255 bytes [1], it's likely user
> could use an identifier with more than 128 bytes when query the file
> directly.
> 
> I would like to ask the Calcite community whether there is any performance
> impact if we bump up the maximum length for a SQL identifier.  If there is
> negligible performance impact, then, can we make the maximum length
> configurable in the SQL parser?  We could make slightly modification to
> CombinedParser.jj template, to allow each system to set its own max length,
> with default still being 128.
> 
> The approach to allow a configurable setting seems to be match what
> Postgres [2] is doing, which allows the length to be "raised by changing
> the NAMEDATALEN constant in src/include/pg_config_manual.h." [2]
> 
> 
> Any suggestions? Thanks!
> 
> Jinfeng
> 
> 
> 
> 1. http://en.wikipedia.org/wiki/Comparison_of_file_systems
> 2.
> http://www.postgresql.org/docs/current/interactive/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS