Posted to issues@flink.apache.org by "yisha zhou (Jira)" <ji...@apache.org> on 2020/10/14 08:29:00 UTC

[jira] [Commented] (FLINK-16627) Support only generate non-null values when serializing into JSON

    [ https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213727#comment-17213727 ] 

yisha zhou commented on FLINK-16627:
------------------------------------

As new connector property keys are introduced in [FLIP-122|https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory], all format ConfigOptions will carry the identifier of the format factory as a prefix, like 'json.fail-on-missing-field'. Therefore our 'json.json-include' would look a little odd in 1.11 because of the repeated 'json'. Taking the discussion above into account, I believe 'json.encode.ignore-null-fields' would be a good choice. What do you think? [~jackray] [~jark] [~libenchao]
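For illustration, here is how the proposed key would sit next to the other prefixed format options in a sink DDL. This is only a sketch assuming the option name above is adopted; it is a proposal in this thread, not a released option:

```sql
CREATE TABLE sink_kafka (
  subtype STRING,
  svt STRING
) WITH (
  'connector' = 'kafka',
  'format' = 'json',
  'json.fail-on-missing-field' = 'false',
  -- hypothetical: the key proposed in this comment
  'json.encode.ignore-null-fields' = 'true'
);
```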

> Support only generate non-null values when serializing into JSON
> ----------------------------------------------------------------
>
>                 Key: FLINK-16627
>                 URL: https://issues.apache.org/jira/browse/FLINK-16627
>             Project: Flink
>          Issue Type: New Feature
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table SQL / Planner
>    Affects Versions: 1.10.0
>            Reporter: jackray wang
>            Assignee: jackray wang
>            Priority: Major
>             Fix For: 1.12.0
>
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {
>   // Replace a null input with an empty string so the sink never sees null.
>   def eval(str: String): String = {
>     if (str == null) "" else str
>   }
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> ----
> Sometimes svt's value is null, and the insert into Kafka produces JSON like \{"subtype":"qin","svt":null}
> If the amount of data is small this is acceptable, but we process 10 TB of data every day, and there may be many nulls in the JSON, which hurts efficiency. If a parameter could be added to drop null keys when defining a sink table, performance would improve greatly.
>  

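Conceptually, the behavior requested in the issue above is what Jackson calls `Include.NON_NULL`: keys with null values are simply not emitted. A dependency-free sketch of that idea follows; the class and method names are illustrative only, not Flink or Jackson APIs, and the boolean flag mirrors the proposed 'json.encode.ignore-null-fields' option:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class NullSkippingJson {

    // Serialize a flat String-to-String row to JSON.
    // When ignoreNullFields is true, keys with null values are dropped
    // entirely instead of being emitted as "key":null.
    static String toJson(Map<String, String> row, boolean ignoreNullFields) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (e.getValue() == null) {
                if (ignoreNullFields) {
                    continue; // skip the key, shrinking the payload
                }
                fields.add("\"" + e.getKey() + "\":null");
            } else {
                fields.add("\"" + e.getKey() + "\":\"" + e.getValue() + "\"");
            }
        }
        return fields.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("subtype", "qin");
        row.put("svt", null);
        System.out.println(toJson(row, false)); // {"subtype":"qin","svt":null}
        System.out.println(toJson(row, true));  // {"subtype":"qin"}
    }
}
```

With many nullable columns, the second form saves the bytes for every absent key/value pair on every record, which is where the efficiency gain at 10 TB/day would come from.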


--
This message was sent by Atlassian Jira
(v8.3.4#803005)