Posted to issues@flink.apache.org by "Huang Xingbo (Jira)" <ji...@apache.org> on 2020/12/04 08:20:00 UTC

[jira] [Updated] (FLINK-20482) Support Map Operation in Python Table API

     [ https://issues.apache.org/jira/browse/FLINK-20482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Huang Xingbo updated FLINK-20482:
---------------------------------
    Description: 
Add Map Operation in Python Table API

The usage:
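{code:python}
from pyflink.common import Row
from pyflink.table import DataTypes
from pyflink.table.udf import udf

t = ...  # type: Table, table schema: [a: String, b: Int, c: Int]

# Map with a general Python UDF: takes one input value, returns a two-field Row
map_func = udf(lambda x: Row(x + 1, x * x),
               result_type=DataTypes.ROW([DataTypes.FIELD("a", DataTypes.INT()),
                                          DataTypes.FIELD("b", DataTypes.INT())]))
t.map(map_func(t.b)).alias("a", "b")
{code}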

  was:
Add Map Operation in Python Table API

The usage:
{code:python}
import pandas as pd
from pyflink.common import Row
from pyflink.table import DataTypes
from pyflink.table.udf import udf

t = ...  # type: Table, table schema: [a: String, b: Int, c: Int]

# Map with a general Python UDF: takes one input value, returns a two-field Row
map_func = udf(lambda x: Row(x + 1, x * x),
               result_type=DataTypes.ROW([DataTypes.FIELD("a", DataTypes.INT()),
                                          DataTypes.FIELD("b", DataTypes.INT())]))
t.map(map_func(t.b)).alias("a", "b")

# Map with a Pandas UDF: inputs arrive as pandas.Series, the result is a DataFrame
pandas_map_func = udf(lambda x, y: pd.concat([x, y], axis=1),
                      result_type=DataTypes.ROW([DataTypes.FIELD("a", DataTypes.INT()),
                                                 DataTypes.FIELD("b", DataTypes.INT())]),
                      func_type="pandas")
t.map(pandas_map_func(t.b, t.c))
{code}


> Support Map Operation in Python Table API
> -----------------------------------------
>
>                 Key: FLINK-20482
>                 URL: https://issues.apache.org/jira/browse/FLINK-20482
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / Python
>            Reporter: Huang Xingbo
>            Priority: Major
>             Fix For: 1.13.0
>
>
> Add Map Operation in Python Table API
> The usage:
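> {code:python}
> from pyflink.common import Row
> from pyflink.table import DataTypes
> from pyflink.table.udf import udf
>
> t = ...  # type: Table, table schema: [a: String, b: Int, c: Int]
> # Map with a general Python UDF: takes one input value, returns a two-field Row
> map_func = udf(lambda x: Row(x + 1, x * x),
>                result_type=DataTypes.ROW([DataTypes.FIELD("a", DataTypes.INT()),
>                                           DataTypes.FIELD("b", DataTypes.INT())]))
> t.map(map_func(t.b)).alias("a", "b")
> {code}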
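
For readers unfamiliar with the proposed semantics, the following is a minimal plain-Python sketch of what the map operation in the description would compute per row. It does not use PyFlink: InputRow and apply_map are hypothetical names introduced only for illustration, and the sample data is made up. The assumption is that map applies the function once per input row and that the returned two-field row becomes the new columns "a" and "b".

```python
from typing import List, NamedTuple, Tuple


class InputRow(NamedTuple):
    """Hypothetical stand-in for a row of the table schema [a: String, b: Int, c: Int]."""
    a: str
    b: int
    c: int


def map_func(x: int) -> Tuple[int, int]:
    # Same body as the lambda in the issue description: (x + 1, x * x).
    return (x + 1, x * x)


def apply_map(rows: List[InputRow]) -> List[Tuple[int, int]]:
    # Assumed semantics: call the function once per row on column b;
    # each returned pair plays the role of the output row ("a", "b").
    return [map_func(row.b) for row in rows]


# Made-up sample data, for illustration only.
rows = [InputRow("x", 1, 10), InputRow("y", 2, 20), InputRow("z", 3, 30)]
result = apply_map(rows)
print(result)
```

Under these assumptions, each input row yields exactly one output row, and the output schema is determined entirely by the function's declared result type rather than by the input schema.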



--
This message was sent by Atlassian Jira
(v8.3.4#803005)