Posted to issues@calcite.apache.org by "Jinfeng Ni (JIRA)" <ji...@apache.org> on 2014/11/13 22:46:34 UTC

[jira] [Created] (CALCITE-464) Make parser accept configurable max length for SQL identifier

Jinfeng Ni created CALCITE-464:
----------------------------------

             Summary: Make parser accept configurable max length for SQL identifier
                 Key: CALCITE-464
                 URL: https://issues.apache.org/jira/browse/CALCITE-464
             Project: Calcite
          Issue Type: Improvement
            Reporter: Jinfeng Ni
            Assignee: Julian Hyde


Calcite currently uses 128 as the maximum length for a SQL identifier.  We would like to enhance the SQL parser so that each system can choose its own maximum identifier length.

The following is the discussion copied from the dev mailing list:

===============================================================
[Jinfeng]

Calcite (formerly Optiq) sets the maximum length of a SQL identifier to 128.  This is reasonable for an ordinary SQL identifier.  However, a system like Drill, which allows users to query a file directly, can easily exceed the allowed identifier length.

Drill allows queries such as:
   
         select * from dfs.schema.`directory1/directory2/filename.json`;

The directory plus the file name is parsed as a SQL identifier. Since many file systems allow file names of up to 255 bytes [1], a user could easily use an identifier longer than 128 bytes when querying a file directly.

I would like to ask the Calcite community whether there would be any performance impact if we raised the maximum length of a SQL identifier.  If the impact is negligible, can we make the maximum length configurable in the SQL parser?  We could make a slight modification to the CombinedParser.jj template to allow each system to set its own maximum length, with the default remaining 128.

Allowing a configurable setting seems to match what Postgres does: it allows the length to be "raised by changing the NAMEDATALEN constant in src/include/pg_config_manual.h." [2]


[Julian]

Each system should be able to choose its own identifier max length.

I’d prefer to make it configurable at runtime. Then we can test it. Add an extra parameter to SqlParser’s constructor, and have it call parser.setIdentifierMaxLength, similar to how it calls setQuotedCasing right now.
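
The runtime-configurable approach described above might look roughly like the following. This is a minimal sketch, not Calcite's generated parser: ToyParser, checkIdentifier and longName are hypothetical names introduced for illustration, and only setIdentifierMaxLength and the default of 128 come from the thread.

```java
final class RuntimeLimitSketch {

    /** Hypothetical stand-in for the generated parser; not Calcite's real class. */
    static final class ToyParser {
        // Default of 128 is the value mentioned in this thread.
        private int identifierMaxLength = 128;

        /** Setter named in the thread; the body here is an assumption. */
        void setIdentifierMaxLength(int identifierMaxLength) {
            if (identifierMaxLength <= 0) {
                throw new IllegalArgumentException("max length must be positive");
            }
            this.identifierMaxLength = identifierMaxLength;
        }

        /** Returns the identifier unchanged, or throws if it exceeds the limit. */
        String checkIdentifier(String name) {
            if (name.length() > identifierMaxLength) {
                throw new IllegalArgumentException(
                    "identifier length " + name.length()
                        + " exceeds maximum " + identifierMaxLength);
            }
            return name;
        }
    }

    /** Builds a string of n 'a' characters, standing in for a long file path. */
    static String longName(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ToyParser parser = new ToyParser();
        String path = longName(200);
        try {
            parser.checkIdentifier(path);
        } catch (IllegalArgumentException e) {
            System.out.println("default limit: " + e.getMessage());
        }
        // Raising the limit at runtime lets the same parser accept the name.
        parser.setIdentifierMaxLength(255);
        System.out.println("raised limit accepts "
            + parser.checkIdentifier(path).length() + " characters");
    }
}
```

Because the limit is an ordinary field rather than a constant baked into the grammar template, a unit test can exercise several limits in the same process.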

Or if you want to go further, create

interface SqlParser.Config {
 int identifierMaxLength();
 Casing quotedCasing();
 Casing unquotedCasing();
}

and replace all of the parameters of SqlParser’s constructor with one Config parameter.
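
The Config-based variant could be sketched as follows. The three interface methods are taken from the proposal above; everything else (the Casing values, the config factory, ToyParser and unquotedIdentifier) is an assumption made for illustration and is not Calcite's API.

```java
import java.util.Locale;

final class ConfigSketch {

    /** Hypothetical casing policy; the thread names the type but not its values. */
    enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

    /** Mirrors the interface proposed in the thread. */
    interface Config {
        int identifierMaxLength();
        Casing quotedCasing();
        Casing unquotedCasing();
    }

    /** Hypothetical factory that bundles the three settings into one Config. */
    static Config config(final int maxLength, final Casing quoted, final Casing unquoted) {
        return new Config() {
            @Override public int identifierMaxLength() { return maxLength; }
            @Override public Casing quotedCasing() { return quoted; }
            @Override public Casing unquotedCasing() { return unquoted; }
        };
    }

    /** Hypothetical parser whose constructor takes a single Config parameter. */
    static final class ToyParser {
        private final Config config;

        ToyParser(Config config) {
            this.config = config;
        }

        /** Checks the length limit, then applies the unquoted casing policy. */
        String unquotedIdentifier(String name) {
            if (name.length() > config.identifierMaxLength()) {
                throw new IllegalArgumentException(
                    "identifier length " + name.length()
                        + " exceeds maximum " + config.identifierMaxLength());
            }
            switch (config.unquotedCasing()) {
                case TO_UPPER:
                    return name.toUpperCase(Locale.ROOT);
                case TO_LOWER:
                    return name.toLowerCase(Locale.ROOT);
                default:
                    return name;
            }
        }
    }

    public static void main(String[] args) {
        Config drillLike = config(255, Casing.UNCHANGED, Casing.TO_UPPER);
        ToyParser parser = new ToyParser(drillLike);
        System.out.println(parser.unquotedIdentifier("emp"));
    }
}
```

With a single Config argument, adding another setting later means adding a method to the interface instead of changing the constructor signature at every call site.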



