Posted to dev@griffin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/08/22 08:13:00 UTC

[jira] [Work logged] (GRIFFIN-266) [Service] Measure's rules are not always properly sorted

     [ https://issues.apache.org/jira/browse/GRIFFIN-266?focusedWorklogId=299260&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299260 ]

ASF GitHub Bot logged work on GRIFFIN-266:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Aug/19 08:12
            Start Date: 22/Aug/19 08:12
    Worklog Time Spent: 10m 
      Work Description: neveljkovic commented on pull request #525: Task GRIFFIN-266: add sorting to EvaluateRule
URL: https://github.com/apache/griffin/pull/525
 
 
   This is how we fixed the PostgreSQL ORDER BY issue described in https://issues.apache.org/jira/browse/GRIFFIN-266
   The solution is deployed on our servers and is working correctly.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 299260)
    Remaining Estimate: 0h
            Time Spent: 10m

> [Service] Measure's rules are not always properly sorted
> --------------------------------------------------------
>
>                 Key: GRIFFIN-266
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-266
>             Project: Griffin
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Nevena Veljkovic
>            Assignee: Kevin Yao
>            Priority: Major
>             Fix For: 0.6.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a measure has more than one rule, which is common practice for dsl.type spark-sql, the measure's rules may be returned in the wrong order, which causes the job to fail.
> Example:
> GET measure by id returns the rules in this order: 3005, 3006 and then 3004 (it should be 3004, 3005, 3006)
> {code:java}
> {
>         "id": 3005,
>         "rule": "SELECT count(*) as incomplete FROM source WHERE (node_metrics_pk IS NULL) OR (node_master_fk IS NULL) OR (location_id IS NULL) OR (freq_band IS NULL) OR (ts IS NULL) ",
>         "dsl.type": "spark-sql",
>         "dq.type": null,
>         "out.dataframe.name": "incomplete_count",
>         "out": [
>             {
>                 "type": "record",
>                 "name": "incomplete_count"
>             },
>             {
>                 "type": "metric",
>                 "name": "incomplete_count"
>             }
>         ]
>     },
>     {
>         "id": 3006,
>         "rule": "SELECT (total - incomplete) AS complete FROM total_count LEFT JOIN incomplete_count",
>         "dsl.type": "spark-sql",
>         "dq.type": null,
>         "out.dataframe.name": "complete_count",
>         "out": [
>             {
>                 "type": "metric",
>                 "name": "complete_count"
>             }
>         ]
>     },
>     {
>         "id": 3004,
>         "rule": "SELECT COUNT(*) AS total FROM source",
>         "dsl.type": "spark-sql",
>         "dq.type": null,
>         "out.dataframe.name": "total_count",
>         "out": [
>             {
>                 "type": "record",
>                 "name": "total_count"
>             },
>             {
>                 "type": "metric",
>                 "name": "total_count"
>             }
>         ]
>     }
> {code}
>  
> Griffin job fails with error:
> {code:java}
> 19/07/11 11:00:31 ERROR transform.SparkSqlTransformStep: run spark sql [ SELECT (total - incomplete) AS complete FROM total_count LEFT JOIN incomplete_count ] error: Table or view not found: total_count; line 1 pos 45
> org.apache.spark.sql.AnalysisException: Table or view not found: total_count
> {code}
> As the log shows, execution of rule 3006 (the query that reads from total_count) fails because rule 3004, which produces total_count, has not been executed yet (due to the incorrect ordering).
> The measure entity EvaluateRule.java does not specify any ordering for its rules:
>  [https://github.com/apache/griffin/blob/master/service/src/main/java/org/apache/griffin/core/measure/entity/EvaluateRule.java#L32-L38]
> According to the PostgreSQL documentation: [https://www.postgresql.org/docs/9.3/sql-select.html]
>  If the ORDER BY clause is specified, the returned rows are sorted in the specified order.
>  If ORDER BY is not given, the rows are returned in whatever order the system finds fastest to produce.
> The proposed solution is to define an explicit ordering of the rules in EvaluateRule.java.
>   
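
To illustrate the idea of an explicit ordering, here is a minimal, self-contained sketch that sorts rules by ascending id before they are executed. It assumes, as the example in the issue suggests, that ascending id matches the intended execution order. The Rule class and the sortById helper are hypothetical stand-ins, not Griffin's actual classes, and this is not necessarily how pull request #525 implements the fix. In a JPA entity the same ordering could instead be declared on the collection mapping with the standard @OrderBy annotation, so that the generated query carries an ORDER BY clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RuleOrdering {

    // Hypothetical stand-in for a rule entity; only the fields needed here.
    static final class Rule {
        final long id;
        final String outDataFrameName;

        Rule(long id, String outDataFrameName) {
            this.id = id;
            this.outDataFrameName = outDataFrameName;
        }
    }

    // Returns a new list sorted by ascending id; the input list is not modified.
    // Assumption: ascending id is the intended execution order.
    static List<Rule> sortById(List<Rule> rules) {
        List<Rule> sorted = new ArrayList<>(rules);
        sorted.sort(Comparator.comparingLong(r -> r.id));
        return sorted;
    }

    public static void main(String[] args) {
        // Same order as returned by the GET request in the issue: 3005, 3006, 3004.
        List<Rule> asReturned = Arrays.asList(
                new Rule(3005L, "incomplete_count"),
                new Rule(3006L, "complete_count"),
                new Rule(3004L, "total_count"));

        // Prints the rules as 3004, 3005, 3006, so total_count and
        // incomplete_count exist before complete_count is computed.
        for (Rule r : sortById(asReturned)) {
            System.out.println(r.id + " -> " + r.outDataFrameName);
        }
    }
}
```

Sorting in application code only covers the call sites that remember to sort; declaring the order on the entity mapping makes every read of the rule list deterministic, which is the direction the issue proposes.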



--
This message was sent by Atlassian Jira
(v8.3.2#803003)