Posted to issues@spark.apache.org by "Davies Liu (JIRA)" <ji...@apache.org> on 2016/06/17 17:57:05 UTC

[jira] [Commented] (SPARK-13753) Column nullable is derived incorrectly

    [ https://issues.apache.org/jira/browse/SPARK-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336567#comment-15336567 ] 

Davies Liu commented on SPARK-13753:
------------------------------------

After discussing this with [~cloud_fan]: we do have a runtime check to make sure that the key of a Map cannot be null, but we do not have a check on the schema. So the query could fail, but it should not return incorrect results.
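
To illustrate the distinction, here is a toy model in plain Python. It is not Spark code, and `build_map` is a hypothetical name made up for this sketch. It only shows the idea that a null key is rejected when the map is built at runtime, so a bad row makes the query fail rather than produce a wrong row:

```python
# Toy model only: plain Python, not Spark code. `build_map` is hypothetical.

def build_map(pairs):
    """Build a map from (key, value) pairs, rejecting null keys at runtime."""
    result = {}
    for key, value in pairs:
        if key is None:
            # Runtime check: fail loudly instead of emitting a wrong row.
            raise ValueError("null map key")
        result[key] = value
    return result

# A row whose keys are all non-null builds normally.
print(build_map([(1.5, "p50"), (2.5, "p90")]))

# A row with a null key fails, regardless of what the schema claimed.
try:
    build_map([(None, "p50")])
except ValueError as err:
    print("query fails:", err)
```

Nothing in this sketch looks at declared nullability, which mirrors the point that the check exists at runtime but not on the schema.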

Could you provide more detail on the expected and actual results?

> Column nullable is derived incorrectly
> --------------------------------------
>
>                 Key: SPARK-13753
>                 URL: https://issues.apache.org/jira/browse/SPARK-13753
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.2
>            Reporter: Jingwei Lu
>            Priority: Critical
>
> Spark SQL derives column nullability incorrectly, and the optimizer then relies on the wrong nullability. In the following query:
> {code}
> select concat("perf.realtime.web", b.tags[1]) as metric, b.value, b.tags[0]
>               from (
>                 select explode(map(a.frontend[0], ARRAY(concat("metric:frontend", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p50"),
>                                  a.frontend[1], ARRAY(concat("metric:frontend", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p90"),
>                                  a.backend[0], ARRAY(concat("metric:backend", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p50"),
>                                  a.backend[1], ARRAY(concat("metric:backend", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p90"),
>                                  a.render[0], ARRAY(concat("metric:render", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p50"),
>                                  a.render[1], ARRAY(concat("metric:render", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p90"),
>                                  a.page_load_time[0], ARRAY(concat("metric:page_load_time", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p50"),
>                                  a.page_load_time[1], ARRAY(concat("metric:page_load_time", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p90"),
>                                  a.total_load_time[0], ARRAY(concat("metric:total_load_time", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p50"),
>                                  a.total_load_time[1], ARRAY(concat("metric:total_load_time", ",controller:", COALESCE(controller, "null"), ",action:", COALESCE(action, "null")), ".p90"))) as (value, tags)
>                 from (
>                   select  data.controller as controller, data.action as action,
>                           percentile(data.frontend, array(0.5, 0.9)) as frontend,
>                           percentile(data.backend, array(0.5, 0.9)) as backend,
>                           percentile(data.render, array(0.5, 0.9)) as render,
>                           percentile(data.page_load_time, array(0.5, 0.9)) as page_load_time,
>                           percentile(data.total_load_time, array(0.5, 0.9)) as total_load_time
>                   from air_events_rt
>                   where type='air_events' and data.event_name='pageload'
>                   group by data.controller, data.action
>                 ) a
>               ) b
>               where b.value is not null
> {code}
> b.value is incorrectly derived as not nullable. The "b.value is not null" predicate is then ignored by the optimizer, which causes the query to return incorrect results.
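
The mechanism described in the report can be sketched with a toy model in plain Python. This is not Spark's optimizer; `simplify_is_not_null` and `run_filter` are hypothetical names, and the two sample rows are made up for illustration. The sketch assumes an optimizer that drops an "is not null" filter whenever the schema declares the column non-nullable:

```python
# Toy model only: plain Python, not Spark's optimizer. All names are hypothetical.

def simplify_is_not_null(column, schema_nullable):
    """Return None (filter dropped) if the schema says the column is never
    null; otherwise return a predicate that keeps only non-null rows."""
    if not schema_nullable[column]:
        return None  # "IS NOT NULL" looks always true, so it is removed
    return lambda row: row[column] is not None

def run_filter(rows, predicate):
    """Apply the predicate, or pass every row through if it was dropped."""
    if predicate is None:
        return list(rows)
    return [row for row in rows if predicate(row)]

# Made-up sample data: one real value and one null.
rows = [{"value": 1.5}, {"value": None}]

# Nullability derived correctly: the filter survives and removes the null row.
correct = run_filter(rows, simplify_is_not_null("value", {"value": True}))

# Nullability derived incorrectly: the filter is dropped and the null row leaks.
wrong = run_filter(rows, simplify_is_not_null("value", {"value": False}))

print(correct)
print(wrong)
```

Under that assumption, a wrong "not nullable" flag is enough to let null rows through even though the query text asks for them to be removed.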


