You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Igor Guzenko (JIRA)" <ji...@apache.org> on 2018/11/14 15:11:00 UTC

[jira] [Updated] (DRILL-540) Allow querying hive views in drill

     [ https://issues.apache.org/jira/browse/DRILL-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Guzenko updated DRILL-540:
-------------------------------
    Description: 
Currently hive views cannot be queried from drill.

*Suggested approach*
 # Drill persists it's views metadata in file with suffix .view.drill using json format. For example: 

{noformat}
{
 "name" : "view_from_calcite_1_4",
 "sql" : "SELECT * FROM `cp`.`store.json`WHERE `store_id` = 0",
 "fields" : [ {
 "name" : "*",
 "type" : "ANY",
 "isNullable" : true
 } ],
 "workspaceSchemaPath" : [ "dfs", "tmp" ]
}{noformat}
        Later drill parses the metadata and uses it to treat view names in SQL as a subquery.

      2. In Apache Hive metadata about views is stored in similar way to tables. Below is example from metastore.TBLS :

 
{noformat}
TBL_ID |CREATE_TIME |DB_ID |LAST_ACCESS_TIME |OWNER |RETENTION |SD_ID |TBL_NAME  |TBL_TYPE      |VIEW_EXPANDED_TEXT                         |
-------|------------|------|-----------------|------|----------|------|----------|--------------|-------------------------------------------|
2      |1542111078  |1     |0                |mapr  |0         |2     |cview     |VIRTUAL_VIEW  |SELECT COUNT(*) FROM `default`.`customers` |{noformat}
      3. So in Hive metastore views are considered as tables of special type. And main benefit is that we also have expanded SQL definition of views (just like in view.drill files). Also reading of the metadata is already implemented in Drill with help of thrift Metastore API.

      4. To enable querying of Hive views I'll reuse existing code for Drill views as much as possible. First in *_HiveSchemaFactory.getDrillTable_* for _*HiveReadEntry*_ I'll convert the metadata to instance of _*View*_ (_which is actually model for data persisted in .view.drill files_) and then based on this instance return new _*DrillViewTable*_. Using this approach drill will handle hive views the same way as if it was initially defined in Drill and persisted in .view.drill file. 

     5. For conversion of Hive types: from _*FieldSchema*_ to _*RelDataType*_ I'll reuse existing code from _*DrillHiveTable*_, so the conversion functionality will be extracted and used for both (table and view) fields type conversions. 

 

  was:
Currently hive views cannot be queried from drill.



> Allow querying hive views in drill
> ----------------------------------
>
>                 Key: DRILL-540
>                 URL: https://issues.apache.org/jira/browse/DRILL-540
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Storage - Hive
>            Reporter: Ramana Inukonda Nagaraj
>            Assignee: Igor Guzenko
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.16.0
>
>
> Currently hive views cannot be queried from drill.
> *Suggested approach*
>  # Drill persists it's views metadata in file with suffix .view.drill using json format. For example: 
> {noformat}
> {
>  "name" : "view_from_calcite_1_4",
>  "sql" : "SELECT * FROM `cp`.`store.json`WHERE `store_id` = 0",
>  "fields" : [ {
>  "name" : "*",
>  "type" : "ANY",
>  "isNullable" : true
>  } ],
>  "workspaceSchemaPath" : [ "dfs", "tmp" ]
> }{noformat}
>         Later drill parses the metadata and uses it to treat view names in SQL as a subquery.
>       2. In Apache Hive metadata about views is stored in similar way to tables. Below is example from metastore.TBLS :
>  
> {noformat}
> TBL_ID |CREATE_TIME |DB_ID |LAST_ACCESS_TIME |OWNER |RETENTION |SD_ID |TBL_NAME  |TBL_TYPE      |VIEW_EXPANDED_TEXT                         |
> -------|------------|------|-----------------|------|----------|------|----------|--------------|-------------------------------------------|
> 2      |1542111078  |1     |0                |mapr  |0         |2     |cview     |VIRTUAL_VIEW  |SELECT COUNT(*) FROM `default`.`customers` |{noformat}
>       3. So in Hive metastore views are considered as tables of special type. And main benefit is that we also have expanded SQL definition of views (just like in view.drill files). Also reading of the metadata is already implemented in Drill with help of thrift Metastore API.
>       4. To enable querying of Hive views I'll reuse existing code for Drill views as much as possible. First in *_HiveSchemaFactory.getDrillTable_* for _*HiveReadEntry*_ I'll convert the metadata to instance of _*View*_ (_which is actually model for data persisted in .view.drill files_) and then based on this instance return new _*DrillViewTable*_. Using this approach drill will handle hive views the same way as if it was initially defined in Drill and persisted in .view.drill file. 
>      5. For conversion of Hive types: from _*FieldSchema*_ to _*RelDataType*_ I'll reuse existing code from _*DrillHiveTable*_, so the conversion functionality will be extracted and used for both (table and view) fields type conversions. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)