Posted to user@flink.apache.org by 徐涛 <ha...@gmail.com> on 2018/08/23 07:11:33 UTC

lack of function and low usability of provided function

Hi All,
	I have found that Flink lacks some basic functions, for example string split, regular expression support, and JSON parsing and extraction. These functions are used frequently in development, but because they are not supported, users have to write UDFs for them.
	Some of the provided functions also lack usability: for example, log(2, 1.0) and exp(1.0) with double parameters are not supported. I think these are very basic functions and not hard to implement.
	Will Flink enhance its basic functions, maybe in later releases?

Best,
Henry

Re: lack of function and low usability of provided function

Posted by 徐涛 <ha...@gmail.com>.
I found another function that does not behave as documented: https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#temporal-functions
The function is TIMESTAMP string.

I use the following SQL, where ttt is of type String:
insert into rslt select word,cast( TIMESTAMPADD(HOUR, 2,   TIMESTAMP ttt    ) as varchar) from xxx

It throws the following exception (the Flink version is 1.7.2):
Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL parse failed. Encountered "TIMESTAMP ttt" at line 1, column 60.
Was expecting one of:
    "+" ...
    "-" ...
    "NOT" ...
    "EXISTS" ...
    <UNSIGNED_INTEGER_LITERAL> ...
    <DECIMAL_NUMERIC_LITERAL> ...
    <APPROX_NUMERIC_LITERAL> ...
    <BINARY_STRING_LITERAL> ...
    <PREFIXED_STRING_LITERAL> ...
    <QUOTED_STRING> ...
    <UNICODE_STRING_LITERAL> ...
    "TRUE" ...
    "FALSE" ...
    "UNKNOWN" ...
    "NULL" ...
    <LBRACE_D> ...
    <LBRACE_T> ...
    <LBRACE_TS> ...
    "DATE" ...
    "TIME" ...
    "TIMESTAMP" <QUOTED_STRING> ...
    "INTERVAL" ...
    "?" ...
    "CAST" ...
    "EXTRACT" ...
    "POSITION" ...
    "CONVERT" ...
    "TRANSLATE" ...
    "OVERLAY" ...
    "FLOOR" ...
    "CEIL" ...
    "CEILING" ...
    "SUBSTRING" ...
    "TRIM" ...
    "CLASSIFIER" ...
    "MATCH_NUMBER" ...
    "RUNNING" ...
    "PREV" ...
    "NEXT" ...
    <LBRACE_FN> ...
    "MULTISET" ...
    "ARRAY" ...
    "PERIOD" ...
    "SPECIFIC" ...
    <IDENTIFIER> ...
    <QUOTED_IDENTIFIER> ...
    <BACK_QUOTED_IDENTIFIER> ...
    <BRACKET_QUOTED_IDENTIFIER> ...
    <UNICODE_QUOTED_IDENTIFIER> ...
    "ABS" ...
    "AVG" ...
    "CARDINALITY" ...
    "CHAR_LENGTH" ...
    "CHARACTER_LENGTH" ...
    "COALESCE" ...
    "COLLECT" ...
    "COVAR_POP" ...
    "COVAR_SAMP" ...
    "CUME_DIST" ...
    "COUNT" ...
    "CURRENT_DATE" ...
    "CURRENT_TIME" ...
    "CURRENT_TIMESTAMP" ...
    "DENSE_RANK" ...
    "ELEMENT" ...
    "EXP" ...
    "FIRST_VALUE" ...
    "FUSION" ...
    "GROUPING" ...
    "HOUR" ...
    "LAG" ...
    "LEAD" ...
    "LAST_VALUE" ...
    "LN" ...
    "LOCALTIME" ...
    "LOCALTIMESTAMP" ...
    "LOWER" ...
    "MAX" ...
    "MIN" ...
    "MINUTE" ...
    "MOD" ...
    "MONTH" ...
    "NTH_VALUE" ...
    "NTILE" ...
    "NULLIF" ...
    "OCTET_LENGTH" ...
    "PERCENT_RANK" ...
    "POWER" ...
    "RANK" ...
    "REGR_SXX" ...
    "REGR_SYY" ...
    "ROW_NUMBER" ...
    "SECOND" ...
    "SQRT" ...
    "STDDEV_POP" ...
    "STDDEV_SAMP" ...
    "SUM" ...
    "UPPER" ...
    "TRUNCATE" ...
    "USER" ...
    "VAR_POP" ...
    "VAR_SAMP" ...
    "YEAR" ...
    "CURRENT_CATALOG" ...
    "CURRENT_DEFAULT_TRANSFORM_GROUP" ...
    "CURRENT_PATH" ...
    "CURRENT_ROLE" ...
    "CURRENT_SCHEMA" ...
    "CURRENT_USER" ...
    "SESSION_USER" ...
    "SYSTEM_USER" ...
    "NEW" ...
    "CASE" ...
    "CURRENT" ...
    "CURSOR" ...
    "ROW" ...
    "(" ...
    
  at org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:94)
  at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:803)
  at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:777)
  at com.ximalaya.flink.dsl.application.simple.HelloWorldTable$.main(HelloWorldTable.scala:44)
  at com.ximalaya.flink.dsl.application.simple.HelloWorldTable.main(HelloWorldTable.scala)
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "TIMESTAMP ttt" at line 1, column 60.
Was expecting one of:
    ... (same list of expected tokens as above)
    
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:347)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:128)
  at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:137)
  at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:162)
  at org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:90)
  ... 4 more
Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered "TIMESTAMP ttt" at line 1, column 60.
Was expecting one of:
    ... (same list of expected tokens as above)
    
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:23019)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:22836)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3379)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.TimestampAddFunctionCall(SqlParserImpl.java:5317)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.BuiltinFunctionCall(SqlParserImpl.java:5281)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.AtomicRowExpression(SqlParserImpl.java:3474)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3319)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.BuiltinFunctionCall(SqlParserImpl.java:5051)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.AtomicRowExpression(SqlParserImpl.java:3474)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3319)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectExpression(SqlParserImpl.java:1525)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectItem(SqlParserImpl.java:1500)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectList(SqlParserImpl.java:1487)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlSelect(SqlParserImpl.java:912)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQuery(SqlParserImpl.java:552)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQueryOrExpr(SqlParserImpl.java:3030)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.QueryOrExpr(SqlParserImpl.java:2949)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.OrderedQueryOrExpr(SqlParserImpl.java:463)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlInsert(SqlParserImpl.java:1212)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmt(SqlParserImpl.java:847)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:869)
  at org.apache.calcite.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:184)
  at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:130)
  ... 6 more
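
A note on the parse error above: the expected-token list contains "TIMESTAMP" <QUOTED_STRING>, which suggests the parser accepts TIMESTAMP only as the prefix of a quoted string literal, not in front of a column reference such as ttt. A minimal sketch of a possible workaround, assuming ttt holds strings in a format such as 'yyyy-MM-dd HH:mm:ss', is to convert the column with CAST instead. The table and column names are the ones from the query above; the statement has not been verified against Flink 1.7.2.

```sql
-- Literal form accepted by the parser: TIMESTAMP followed by a quoted string,
-- e.g. TIMESTAMP '2018-08-23 07:11:33'.
-- For a string column, cast it to TIMESTAMP first (assumed workaround).
INSERT INTO rslt
SELECT
  word,
  CAST(TIMESTAMPADD(HOUR, 2, CAST(ttt AS TIMESTAMP)) AS VARCHAR)
FROM xxx
```

If the strings in ttt are not in a format the cast understands, the conversion would presumably have to be done in a UDF instead.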


Best
Henry

> On Aug 23, 2018, at 4:11 PM, Fabian Hueske <fh...@gmail.com> wrote:
> 
> Hi Henry,
> 
> Flink is an open source project. New built-in functions are constantly contributed to Flink. Right now, there are more than 5 PRs open to add or improve various functions.
> 
> If you find that some functions are not working correctly or could be improved, you can open a Jira issue. The same applies to missing functions. 
> Before opening an issue, it would be good to check if there's another issue for the same problem.
> After opening a Jira issue, you can either wait for somebody to pick it up or contribute the function yourself.
> The scalar function umbrella Jira issue [1] explains what needs to be done to add a function.
> 
> Contributing scalar functions is a good way to get involved with Flink development.
> 
> Best, Fabian
> 
> [1] https://issues.apache.org/jira/browse/FLINK-6810
> 
> On Thu, Aug 23, 2018 at 09:16, 徐涛 <happydexutao@gmail.com> wrote:
> Hi All,
>         I have found that Flink lacks some basic functions, for example string split, regular expression support, and JSON parsing and extraction. These functions are used frequently in development, but because they are not supported, users have to write UDFs for them.
>         Some of the provided functions also lack usability: for example, log(2, 1.0) and exp(1.0) with double parameters are not supported. I think these are very basic functions and not hard to implement.
>         Will Flink enhance its basic functions, maybe in later releases?
> 
> Best,
> Henry


Re: lack of function and low usability of provided function

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Henry,

Flink is an open source project. New built-in functions are constantly
contributed to Flink. Right now, there are more than 5 PRs open to add or
improve various functions.

If you find that some functions are not working correctly or could be
improved, you can open a Jira issue. The same applies to missing functions.
Before opening an issue, it would be good to check if there's another issue
for the same problem.
After opening a Jira issue, you can either wait for somebody to pick it up
or contribute the function yourself.
The scalar function umbrella Jira issue [1] explains what needs to be done
to add a function.

Contributing scalar functions is a good way to get involved with Flink
development.

Best, Fabian

[1] https://issues.apache.org/jira/browse/FLINK-6810

On Thu, Aug 23, 2018 at 09:16, 徐涛 <ha...@gmail.com> wrote:

> Hi All,
>         I have found that Flink lacks some basic functions, for example
> string split, regular expression support, and JSON parsing and extraction.
> These functions are used frequently in development, but because they are
> not supported, users have to write UDFs for them.
>         Some of the provided functions also lack usability: for example,
> log(2, 1.0) and exp(1.0) with double parameters are not supported. I think
> these are very basic functions and not hard to implement.
>         Will Flink enhance its basic functions, maybe in later releases?
>
> Best,
> Henry

Re: lack of function and low usability of provided function

Posted by Timo Walther <tw...@apache.org>.
Hi Henry,

Thanks for the feedback. Extending the set of built-in functions is a
continuous effort that will never be considered "done". If you think a
function should be supported, you can open an issue under FLINK-6810 and
we can discuss its priority.

Flink is an open source project, so feel free to contribute by opening
issues, reviewing existing PRs, or writing code.

Regards,
Timo


On 23.08.18 at 10:01, vino yang wrote:
> Hi Henry,
>
> I recently submitted some PRs about Scalar functions, some of which 
> have been merged and some are being reviewed, and some may be what you 
> need.
>
> Log2(x): https://issues.apache.org/jira/browse/FLINK-9928, will be
> released in Flink 1.7
> exp(x): already exists, see
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html
> regular expression support: use SIMILAR TO, see also
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html
> regex extract: https://issues.apache.org/jira/browse/FLINK-9990 (in
> review)
> regex replace: https://issues.apache.org/jira/browse/FLINK-9991 (in review)
>
> The status of most scalar functions can be seen here : 
> https://issues.apache.org/jira/browse/FLINK-6810
>
> Thanks, vino.
>
>
> On Thu, Aug 23, 2018 at 3:16 PM, 徐涛 <happydexutao@gmail.com> wrote:
>
>     Hi All,
>             I have found that Flink lacks some basic functions, for
>     example string split, regular expression support, and JSON parsing
>     and extraction. These functions are used frequently in development,
>     but because they are not supported, users have to write UDFs for them.
>             Some of the provided functions also lack usability: for
>     example, log(2, 1.0) and exp(1.0) with double parameters are not
>     supported. I think these are very basic functions and not hard to
>     implement.
>             Will Flink enhance its basic functions, maybe in later
>     releases?
>
>     Best,
>     Henry
>


Re: lack of function and low usability of provided function

Posted by vino yang <ya...@gmail.com>.
Hi Henry,

I recently submitted some PRs for scalar functions. Some have been merged,
some are under review, and some may be what you need.

Log2(x): https://issues.apache.org/jira/browse/FLINK-9928, will be released
in Flink 1.7
exp(x): already exists, see
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html
regular expression support: use SIMILAR TO, see also
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sql.html
regex extract: https://issues.apache.org/jira/browse/FLINK-9990 (in review)
regex replace: https://issues.apache.org/jira/browse/FLINK-9991 (in review)
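
To make these pointers concrete, here is a minimal SQL sketch. The table events and its columns word and x are hypothetical, and the query has not been run against a specific Flink release. It shows SIMILAR TO for pattern matching, EXP on an argument that is explicitly typed as DOUBLE, and a base-2 logarithm built from LN via the change-of-base identity log2(x) = ln(x) / ln(2), for versions that do not yet ship LOG2.

```sql
-- Hypothetical table: events(word VARCHAR, x DOUBLE)
SELECT
  word,
  EXP(CAST(x AS DOUBLE))           AS exp_x,   -- explicit DOUBLE argument
  LN(x) / LN(CAST(2 AS DOUBLE))    AS log2_x   -- change of base: ln(x) / ln(2)
FROM events
WHERE word SIMILAR TO '[a-z]+'                 -- pattern must match the whole string
```

The explicit casts are a guess at the "double params" problem: a literal such as 1.0 is an exact DECIMAL in SQL, so casting to DOUBLE may sidestep a missing overload.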

The status of most scalar functions can be seen here:
https://issues.apache.org/jira/browse/FLINK-6810

Thanks, vino.


On Thu, Aug 23, 2018 at 3:16 PM, 徐涛 <ha...@gmail.com> wrote:

> Hi All,
>         I have found that Flink lacks some basic functions, for example
> string split, regular expression support, and JSON parsing and extraction.
> These functions are used frequently in development, but because they are
> not supported, users have to write UDFs for them.
>         Some of the provided functions also lack usability: for example,
> log(2, 1.0) and exp(1.0) with double parameters are not supported. I think
> these are very basic functions and not hard to implement.
>         Will Flink enhance its basic functions, maybe in later releases?
>
> Best,
> Henry