Posted to dev@griffin.apache.org by "Yao WANG (Jira)" <ji...@apache.org> on 2019/12/24 06:59:00 UTC
[jira] [Commented] (GRIFFIN-311) keep null value columns optionally

    [ https://issues.apache.org/jira/browse/GRIFFIN-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002692#comment-17002692 ] 

Yao WANG commented on GRIFFIN-311:
----------------------------------

Actually, the result DataFrame keeps the null values in its internal rows.
However, when you output the result in JSON format,
Griffin uses Spark's JacksonGenerator to write each record,
and JacksonGenerator skips any field whose value is null,
as you can see in the JacksonGenerator source code:


{code:java}
  private def writeFields(
      row: InternalRow, schema: StructType, fieldWriters: Seq[ValueWriter]): Unit = {
    var i = 0
    while (i < row.numFields) {
      val field = schema(i)
      if (!row.isNullAt(i)) {
        gen.writeFieldName(field.name)
        fieldWriters(i).apply(row, i)
      }
      i += 1
    }
  }
{code}
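
To make the effect of the isNullAt check concrete, here is a minimal, dependency-free sketch of the same loop. It is not Spark or Griffin code: NullSkipDemo, writeFields and the keepNulls flag are hypothetical names used only for illustration, None stands in for a SQL NULL, and string escaping is ignored.

```scala
// Simplified model of the null-skipping loop shown above.
// NullSkipDemo, writeFields and keepNulls are hypothetical names,
// not part of Spark's JacksonGenerator or of Griffin.
object NullSkipDemo {
  // Render one value as a JSON literal (strings quoted, no escaping;
  // everything else via toString).
  private def render(value: Any): String = value match {
    case s: String => "\"" + s + "\""
    case other     => other.toString
  }

  // Serialize one row. A None entry models a SQL NULL.
  // keepNulls = false mirrors the `if (!row.isNullAt(i))` guard:
  // the field name is never written, so the key disappears.
  def writeFields(names: Seq[String], row: Seq[Option[Any]], keepNulls: Boolean): String = {
    val fields = names.zip(row).flatMap {
      case (name, Some(v))           => Some("\"" + name + "\": " + render(v))
      case (name, None) if keepNulls => Some("\"" + name + "\": null")
      case _                         => None
    }
    fields.mkString("{", ", ", "}")
  }

  def main(args: Array[String]): Unit = {
    val names = Seq("name", "age")
    val row   = Seq(None, Some(12))
    println(writeFields(names, row, keepNulls = false)) // {"age": 12}
    println(writeFields(names, row, keepNulls = true))  // {"name": null, "age": 12}
  }
}
```

With keepNulls = false the sketch reproduces the output reported in the issue; with keepNulls = true it writes an explicit null, which is the shape the reporter expects. The rows themselves are identical in both cases, and only the serialization step differs.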


> keep null value columns optionally
> ----------------------------------
>
>                 Key: GRIFFIN-311
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-311
>             Project: Griffin
>          Issue Type: New Feature
>            Reporter: XIAOYU YU
>            Priority: Major
>
> Hi all, 
>  Our team is facing a confusing issue: Griffin's output does not keep columns whose values are null. 
>  Our Spark version is 2.2. For the following table xxx.xxxx and rule: 
> |name|age|date|
> |NULL|12|2018-11-11|
> |sizhe|NULL|2018-11-11|
> rule: 
> {code:java}
> "rule" : "select name, age from xxx.xxxx", 
> "out" : {"type" : "metric", "flatten" : "array"}, 
> "dsl.type" : "spark-sql", "out.dataframe.name" : "payloads"{code}
> Griffin's result would be as follows, which omits the null values and the null line.
> {code:java}
>  [{"age": 12}, {"name": "sizhe"}]  {code}
> This confuses some users when they find columns missing from the result. The result they expect is:
> {code:java}
>  [{"name": null, "age": 12}, {"name": "sizhe", "age": null}]  {code}
>  
>  And for sql: 
> {code:java}
> "rule" : "select name, age from xxx.xxxx where date = '2019-01-01'",  {code}
> Griffin's result would be []. But the result they expect is
> {code:java}
> [{"name": null, "age": null}]{code}
> , which contains a null line.
>   
>  Could Griffin add a new feature to optionally keep the null value columns, to make these users happy?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)