Posted to dev@hive.apache.org by "Shivangi (Jira)" <ji...@apache.org> on 2021/09/09 21:15:00 UTC

[jira] [Created] (HIVE-25510) Incorrect lineage for compare expressions in select statements

Shivangi created HIVE-25510:
-------------------------------

             Summary: Incorrect lineage for compare expressions in select statements
                 Key: HIVE-25510
                 URL: https://issues.apache.org/jira/browse/HIVE-25510
             Project: Hive
          Issue Type: Bug
          Components: lineage
            Reporter: Shivangi
            Assignee: Shivangi


Incorrect lineage is generated for queries whose select statements contain compare expressions. For example:

*`Case-when` in select statement:*

Query: 
{code:java}
select place, (case when city == "aa" then id else 0 end)/id from t1;
{code}
Generated lineage:
{code:java}
{
  "edges": [
    {
      "sources": [
        2
      ],
      "targets": [
        0
      ],
      "edgeType": "PROJECTION"
    },
    {
      "sources": [
        3,
        4
      ],
      "targets": [
        1
      ],
      "expression": "(UDFToDouble(CASE WHEN ((UDFToString(t1.city) = 'aa')) THEN (t1.id) ELSE (0) END) / UDFToDouble(t1.id))",
      "edgeType": "PROJECTION"
    }
  ],
  "vertices": [
    {
      "id": 0,
      "vertexType": "COLUMN",
      "vertexId": "place"
    },
    {
      "id": 1,
      "vertexType": "COLUMN",
      "vertexId": "_c1"
    },
    {
      "id": 2,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.place"
    },
    {
      "id": 3,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.city"
    },
    {
      "id": 4,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.id"
    }
  ]
}
{code}
Expected Lineage:
{code:java}
{
  "edges": [
    {
      "sources": [
        2
      ],
      "targets": [
        0
      ],
      "edgeType": "PROJECTION"
    },
    {
      "sources": [
        3
      ],
      "targets": [
        1
      ],
      "expression": "(UDFToDouble(CASE WHEN ((UDFToString(t1.city) = 'aa')) THEN (t1.id) ELSE (0) END) / UDFToDouble(t1.id))",
      "edgeType": "PROJECTION"
    },
    {
      "sources": [
        4
      ],
      "targets": [
        1
      ],
      "expression": "CASE WHEN ((UDFToString(t1.city) = 'aa')) THEN (t1.id) ELSE (0) END",
      "edgeType": "PREDICATE"
    }
  ],
  "vertices": [
    {
      "id": 0,
      "vertexType": "COLUMN",
      "vertexId": "place"
    },
    {
      "id": 1,
      "vertexType": "COLUMN",
      "vertexId": "_c1"
    },
    {
      "id": 2,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.place"
    },
    {
      "id": 3,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.id"
    },
    {
      "id": 4,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.city"
    }
  ]
}
{code}
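Note that the vertex ids for `default.t1.city` and `default.t1.id` are swapped between the two outputs above, so the edges are easier to compare by vertex name than by id. The following is a minimal sketch of such a comparison, not part of Hive: the helper name `flatten_edges` is hypothetical, and the lineage dictionaries are trimmed copies of the JSON above (second output column only, expressions left out).

```python
def flatten_edges(lineage):
    """Return a set of (source vertexId, target vertexId, edgeType) triples."""
    names = {v["id"]: v["vertexId"] for v in lineage["vertices"]}
    return {
        (names[s], names[t], edge["edgeType"])
        for edge in lineage["edges"]
        for s in edge["sources"]
        for t in edge["targets"]
    }


# Trimmed copy of the generated lineage for column _c1.
actual = {
    "edges": [{"sources": [3, 4], "targets": [1], "edgeType": "PROJECTION"}],
    "vertices": [
        {"id": 1, "vertexId": "_c1"},
        {"id": 3, "vertexId": "default.t1.city"},
        {"id": 4, "vertexId": "default.t1.id"},
    ],
}

# Trimmed copy of the expected lineage for column _c1.
expected = {
    "edges": [
        {"sources": [3], "targets": [1], "edgeType": "PROJECTION"},
        {"sources": [4], "targets": [1], "edgeType": "PREDICATE"},
    ],
    "vertices": [
        {"id": 1, "vertexId": "_c1"},
        {"id": 3, "vertexId": "default.t1.id"},
        {"id": 4, "vertexId": "default.t1.city"},
    ],
}

# Edges present in the generated lineage but not expected, and vice versa.
wrong = flatten_edges(actual) - flatten_edges(expected)
missing = flatten_edges(expected) - flatten_edges(actual)
print("unexpected edges:", sorted(wrong))
print("missing edges:", sorted(missing))
```

Compared this way, the only difference is the edge from `default.t1.city` to `_c1`: it is reported as a PROJECTION edge but is expected to be a PREDICATE edge.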

*`IF` statement in select statement:*

Query:
{code:java}
select IF(city='aa',place,'FALSE') from t1;
{code}
Generated lineage:
{code:java}
{
  "edges": [
    {
      "sources": [
        1,
        2
      ],
      "targets": [
        0
      ],
      "expression": "if((UDFToString(t1.city) = 'aa'), t1.place, 'FALSE')",
      "edgeType": "PROJECTION"
    }
  ],
  "vertices": [
    {
      "id": 0,
      "vertexType": "COLUMN",
      "vertexId": "_c0"
    },
    {
      "id": 1,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.city"
    },
    {
      "id": 2,
      "vertexType": "COLUMN",
      "vertexId": "default.t1.place"
    }
  ]
}
{code}

Expected Lineage:
The projection edge for target `vertex 0` should have only `vertex 2` as its source, and there should also be one predicate edge with `vertex 1` as its source and `vertex 0` as its target.
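
As a rough illustration of that description, the expected edges for the `IF` query could be written out and checked as below. This is only a sketch under assumptions: keeping the full `if(...)` expression on the projection edge follows the pattern of the case-when example and is not stated in this report, the predicate edge's expression is left out because the report does not give it, and the helper `sources_by_edge_type` is a hypothetical name.

```python
import json

# Expected lineage for: select IF(city='aa',place,'FALSE') from t1;
expected = {
    "edges": [
        {
            "sources": [2],
            "targets": [0],
            # Assumption: the projection edge keeps the full expression.
            "expression": "if((UDFToString(t1.city) = 'aa'), t1.place, 'FALSE')",
            "edgeType": "PROJECTION",
        },
        {
            # The report does not state this edge's expression, so it is omitted.
            "sources": [1],
            "targets": [0],
            "edgeType": "PREDICATE",
        },
    ],
    "vertices": [
        {"id": 0, "vertexType": "COLUMN", "vertexId": "_c0"},
        {"id": 1, "vertexType": "COLUMN", "vertexId": "default.t1.city"},
        {"id": 2, "vertexType": "COLUMN", "vertexId": "default.t1.place"},
    ],
}


def sources_by_edge_type(lineage, target_id):
    """Group the source vertex names feeding one target vertex by edge type."""
    names = {v["id"]: v["vertexId"] for v in lineage["vertices"]}
    grouped = {}
    for edge in lineage["edges"]:
        if target_id in edge["targets"]:
            grouped.setdefault(edge["edgeType"], []).extend(
                names[s] for s in edge["sources"]
            )
    return {edge_type: sorted(srcs) for edge_type, srcs in grouped.items()}


result = sources_by_edge_type(expected, 0)
print(json.dumps(result, indent=2))
```

For target `vertex 0`, the grouping shows `default.t1.place` as the only projection source and `default.t1.city` as the only predicate source, which is the split described above.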



 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)